Re: [ceph-users] Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
On Sat, Jul 27, 2019 at 06:08:58PM +0530, Ajitha Robert wrote:

> 1) Will there be any folder related to rbd-mirroring in /var/lib/ceph?

no

> 2) Is ceph rbd-mirror authentication mandatory?

no. But why are you asking?

> 3) whenever i create any cinder volume loaded with glance image i get the following error..
>
> 2019-07-27 17:26:46.762571 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019-07-27 17:27:07.541701 7f939d7fa700 0 rbd::mirror::ImageReplayer: 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down: remote image no longer exists: scheduling deletion
> 2019-07-27 17:27:16.766199 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019-07-27 17:27:22.568970 7f939d7fa700 0 rbd::mirror::ImageReplayer: 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down: mirror image no longer exists
> 2019-07-27 17:27:46.769158 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019

The log tells that the primary image was deleted for some reason, and the rbd-mirror scheduled the deletion of the secondary (mirrored) image. It is not visible from the logs why the primary image was deleted. It might be cinder, but I can't exclude some bug in the rbd-mirror running on the primary cluster, though I don't recall any issues like this.

> At times i am able to create a bootable cinder volume apart from the above errors, but certain times i face the following
>
> example: For a 50 GB volume, the local image gets created, but it couldn't create a mirror image

"Connection timed out" errors suggest you have a connectivity issue between sites?

-- Mykola Golub

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
On Fri, Jul 26, 2019 at 04:40:35PM +0530, Ajitha Robert wrote:

> Thank you for the clarification.
>
> But i was trying with openstack-cinder.. when i load some data into the volume around 50gb, the image sync will stop by 5 % or something within 15%... What could be the reason?

I suppose you see the image sync stop in the mirror status output? Could you please provide an example? And I suppose you don't see any other messages in the rbd-mirror log apart from what you have already posted? Depending on configuration rbd-mirror might log to several logs. Could you please try to find all its logs? `lsof | grep 'rbd-mirror.*log'` may be useful for this.

BTW, what rbd-mirror version are you running?

-- Mykola Golub
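The `lsof` filter suggested above can be illustrated on a captured sample of lsof-style output (the sample lines below are made up for illustration; the file names are not from the original thread, and real use needs a running rbd-mirror daemon):

```shell
# Filter open log files belonging to rbd-mirror out of lsof-style output.
# "$sample" stands in for the real `lsof` output.
sample="rbd-mirro 1234 ceph 5w REG 8,1 4096 /var/log/ceph/ceph-client.rbd-mirror.log
sshd       999 root 3u REG 8,1   99 /var/log/auth.log"
echo "$sample" | grep 'rbd-mirror.*log'
```

Only the rbd-mirror log line survives the filter; each surviving path is a log file worth checking for errors.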
Re: [ceph-users] Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
On Fri, Jul 26, 2019 at 12:31:59PM +0530, Ajitha Robert wrote:

> I have a rbd mirroring setup with primary and secondary clusters as peers, and I have a pool with image mode enabled. In this i created a rbd image, enabled with journaling.
> But whenever i enable mirroring on the image, I am getting errors in rbdmirror.log and osd.log.
> I have increased the timeouts.. nothing worked and couldn't trace out the error.
> Please guide me to solve this error.
>
> Logs
> http://paste.openstack.org/show/754766/

What do you mean by "nothing worked"? According to the mirroring status the image is mirroring: it is in "up+stopped" state on the primary as expected, and in "up+replaying" state on the secondary with 0 entries behind master.

The "failed to get omap key" error in the osd log is harmless, and just a week ago the fix was merged upstream not to display it. The cause of the "InstanceWatcher: ... resending after timeout" error in the rbd-mirror log is not clear, but if it is not repeating it is harmless too.

I see you were trying to map the image with krbd. It is expected to fail as krbd does not support the "journaling" feature, which is necessary for mirroring. You can access those images only with librbd (e.g. mapping with the rbd-nbd driver or via qemu).

-- Mykola Golub
Re: [ceph-users] RBD image format v1 EOL ...
On Fri, Feb 22, 2019 at 02:43:36PM +0200, koukou73gr wrote:

> On 2019-02-20 17:38, Mykola Golub wrote:
> > Note, even if rbd supported live (without any downtime) migration, you would still need to restart the client after the upgrade to a new librbd with migration support.
>
> You could probably get away with executing the client with a new librbd version by live migrating the VM to an updated hypervisor.
>
> At least, this is what I have been doing so far when updating Ceph client libraries with zero downtime.

Yes, and this is what I meant when I was writing about investigating a possibility to migrate to the new format with zero downtime by using VM live migration. If there were a way to execute "rbd migration prepare $pool/$image" during live VM migration, exactly after the source VM closes the rbd image but before the destination VM opens it, it would do the trick (the destination VM would start using the new format). Right now I don't know if it is possible at all, because I am not familiar with VM migration, e.g. I don't know if it opens the image on the destination before or after it closes it on the source.

-- Mykola Golub
Re: [ceph-users] RBD image format v1 EOL ...
On Wed, Feb 20, 2019 at 10:22:47AM +0100, Jan Kasprzak wrote:

> If I read the parallel thread about pool migration in ceph-users@ correctly, the ability to migrate to v2 would still require stopping the client before the "rbd migration prepare" can be executed.

Note, even if rbd supported live (without any downtime) migration, you would still need to restart the client after the upgrade to a new librbd with migration support. So actually you can combine the upgrade with the migration:

  upgrade client library
  stop client
  rbd migration prepare
  start client

and eventually:

  rbd migration execute
  rbd migration commit

And it would be interesting to investigate a possibility to replace the "stop/start client" steps with "migrate the VM to another (upgraded) host" to avoid stopping the VM at all. The trick would be to execute "rbd migration prepare" somehow after the source VM closes the image, but before the destination VM opens it.

-- Mykola Golub
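Spelled out with placeholder names (the pool/image below are assumptions for illustration, and the upgrade/stop/start steps depend on your package manager and hypervisor), the combined sequence might look like this sketch, with the commands echoed rather than executed:

```shell
# Combined librbd upgrade + v1->v2 image migration sequence (sketch only;
# pool/image are placeholders, comments mark the non-rbd steps).
pool=rbd
image=vm-disk-1
echo "# 1. upgrade the client library (distro-specific)"
echo "# 2. stop the client using the image"
echo "rbd migration prepare ${pool}/${image}"
echo "# 3. start the client again"
echo "rbd migration execute ${pool}/${image}"
echo "rbd migration commit ${pool}/${image}"
```

Note that the client must be down only across the "prepare" step; "execute" and "commit" can run while the client is back up.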
Re: [ceph-users] read-only mounts of RBD images on multiple nodes for parallel reads
On Tue, Jan 22, 2019 at 01:26:29PM -0800, Void Star Nill wrote:

> Regarding Mykola's suggestion to use Read-Only snapshots, what is the overhead of creating these snapshots? I assume these are copy-on-write snapshots, so there's no extra space consumed except for the metadata?

Yes.

-- Mykola Golub
Re: [ceph-users] slow requests and high i/o / read rate on bluestore osds after upgrade 12.2.8 -> 12.2.10
On Fri, Jan 18, 2019 at 11:06:54AM -0600, Mark Nelson wrote:

> IE even though you guys set bluestore_cache_size to 1GB, it is being overridden by bluestore_cache_size_ssd.

Isn't it vice versa [1]?

[1] https://github.com/ceph/ceph/blob/luminous/src/os/bluestore/BlueStore.cc#L3976

-- Mykola Golub
Re: [ceph-users] read-only mounts of RBD images on multiple nodes for parallel reads
On Thu, Jan 17, 2019 at 10:27:20AM -0800, Void Star Nill wrote:

> Hi,
>
> We are trying to use Ceph in our products to address some of the use cases. We think Ceph block device works for us. One of the use cases is that we have a number of jobs running in containers that need to have Read-Only access to shared data. The data is written once and is consumed multiple times. I have read through some of the similar discussions and the recommendations on using CephFS for these situations, but in our case a block device makes more sense as it fits well with other use cases and restrictions we have around this use case.
>
> The following scenario seems to work as expected when we tried it on a test cluster, but we wanted to get an expert opinion to see if there would be any issues in production. The usage scenario is as follows:
>
> - A block device is created with the "--image-shared" option:
>
> rbd create mypool/foo --size 4G --image-shared

"--image-shared" just means that the created image will have the "exclusive-lock" feature, and all other features that depend on it, disabled. It is useful for scenarios when one wants simultaneous write access to the image (e.g. when using a shared-disk cluster fs like ocfs2) and does not want a performance penalty due to the "exclusive-lock" being ping-ponged between writers. For your scenario it is not necessary but is ok.

> - The image is mapped to a host, formatted in ext4 format (or other file formats), mounted to a directory in read/write mode and data is written to it. Please note that the image will be mapped in exclusive write mode -- no other read/write mounts are allowed at this time.

The map "exclusive" option works only for images with the "exclusive-lock" feature enabled, and in that case prevents automatic exclusive lock transitions (the ping-pong mentioned above) from one writer to another. And in this case it will not prevent mapping and mounting it ro, and probably even rw (I am not familiar enough with the kernel rbd implementation to be sure here), though in the last case the write will fail.

> - The volume is unmapped from the host and then mapped on to N number of other hosts where it will be mounted in read-only mode and the data is read simultaneously from N readers
>
> As mentioned above, this seems to work as expected, but we wanted to confirm that we won't run into any unexpected issues.

It should work. Although, as you can see, rbd hardly protects simultaneous access in this case, so it should be carefully organized on a higher level. But you may consider creating a snapshot after modifying the image, and mapping and mounting the snapshot on the readers. This way you can even modify the image without unmounting the readers, and then remap/remount the new snapshot. And you will have a rollback option for free.

Also there is a valid concern mentioned by others that ext4 might want to flush the journal if it is not clean, even when mounting ro. I expect the mount will just fail in this case because the image is mapped ro, but you might want to investigate how to improve this.

-- Mykola Golub
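The snapshot-based publishing flow suggested above could be sketched like this (commands are echoed only; the pool, image, and snapshot names are placeholders, and using the ext4 `noload` mount option to skip journal replay is my assumption for working around the dirty-journal concern, not something tested in the thread):

```shell
# Writer publishes a frozen version; each reader maps and mounts it ro.
pool=mypool; image=foo; snap=v1
echo "rbd snap create ${pool}/${image}@${snap}"      # freeze current data
echo "rbd map ${pool}/${image}@${snap} --read-only"  # on each reader host
echo "mount -o ro,noload /dev/rbd0 /mnt/data"        # noload: skip ext4 journal replay
```

To publish a new version, the writer creates a new snapshot and the readers remap/remount it; the old snapshot remains available for rollback.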
Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object
On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:

> I reconfigure the osd service from start, the journal was:

I am not quite sure I understand what you mean here.

> --
> -- Unit ceph-osd@0.service has finished starting up.
> --
> -- The start-up result is RESULT.
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80 -1 Public network was set, but cluster network was not set
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80 -1 Using public network also for cluster network
> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365 7f6a27204e80 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of a>
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414 7f6a27204e80 -1 journal do_read_entry(6930432): bad header magic
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729 7f6a27204e80 -1 osd.0 21 log_to_monitors {default=true}
> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13 for check of host 'localhost' was out of bounds.
> --

Could you please post the full ceph-osd log somewhere? /var/log/ceph/ceph-osd.0.log

> but hang at the command: "rbd create libvirt-pool/dimage --size 10240"

So it hangs forever now instead of returning the error? What is the `ceph -s` output?

-- Mykola Golub
Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object
On Mon, Nov 05, 2018 at 06:14:09PM +0800, Dengke Du wrote:

> -1 osd.0 20 class rbd open got (2) No such file or directory

So the rbd cls was not loaded. Look at the directory returned by this command:

  ceph-conf --name osd.0 -D | grep osd_class_dir

and check if it contains libcls_rbd.so. And check if the list returned by this command:

  ceph-conf --name osd.0 -D | grep osd_class_load_list

contains rbd.

-- Mykola Golub
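The two checks above can be folded into a small script. This is a sketch: the osd_class_dir and osd_class_load_list values are assumed example values standing in for what the `ceph-conf` queries would return, not values taken from the original report.

```shell
# Verify the rbd object class is both allowed to load and present on disk.
class_dir="/usr/lib64/rados-classes"                  # assumed example value
load_list="cephfs hello journal lock log numops rbd"  # assumed example value

# One word per line, then exact-match "rbd".
if printf '%s\n' $load_list | grep -qx rbd; then
    echo "rbd is in osd_class_load_list"
else
    echo "rbd is MISSING from osd_class_load_list"
fi

if [ -e "${class_dir}/libcls_rbd.so" ]; then
    echo "libcls_rbd.so found in ${class_dir}"
else
    echo "libcls_rbd.so not found in ${class_dir}"
fi
```

If either check fails on a real osd, that explains the "class rbd open got (2) No such file or directory" error.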
Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object
On Mon, Nov 05, 2018 at 03:19:29PM +0800, Dengke Du wrote:

> Hi all
>
> ceph: 13.2.2
>
> When run command:
>
> rbd create libvirt-pool/dimage --size 10240
>
> Error happen:
>
> rbd: create error: 2018-11-04 23:54:56.224 7ff22e7fc700 -1 librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object: (95) Operation not supported
> (95) Operation not supported

Most likely the libcls_rbd.so plugin was not loaded by the osd for some reason (not found?). Could you please try grepping the osd logs for cls errors? You will probably need to restart an osd to get some fresh ones on osd start.

-- Mykola Golub
Re: [ceph-users] rbd-nbd map question
Vikas, could you tell which version you observe this on? Because I can reproduce this only on jewel; it has been fixed since luminous 12.2.1 [1].

[1] http://tracker.ceph.com/issues/20426

On Wed, Sep 19, 2018 at 03:48:44PM -0400, Jason Dillaman wrote:

> Thanks for reporting this -- it looks like we broke the part where command-line config overrides were parsed out from the parsing. I've opened a tracker ticket against the issue [1].
>
> On Wed, Sep 19, 2018 at 2:49 PM Vikas Rana wrote:
> >
> > Hi there,
> >
> > With default cluster name "ceph" I can map rbd-nbd without any issue.
> >
> > But for a different cluster name, i'm not able to map image using rbd-nbd and getting
> >
> > root@vtier-P-node1:/etc/ceph# rbd-nbd --cluster cephdr map test-pool/testvol
> > rbd-nbd: unknown command: --cluster
> >
> > I looked at the man page and the syntax looks right.
> > Can someone please help me on what I'm doing wrong?
> >
> > Thanks,
> > -Vikas
>
> [1] http://tracker.ceph.com/issues/36089
>
> --
> Jason

-- Mykola Golub
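On affected (pre-12.2.1) versions, a possible workaround (my assumption, based on --cluster normally being shorthand for pointing at /etc/ceph/<cluster>.conf, not something confirmed in this thread) is to pass the conf file explicitly instead of --cluster. The sketch below only builds and prints the command:

```shell
# Build an explicit-conf rbd-nbd invocation for a non-default cluster name.
# (Printed, not executed; whether --conf is parsed correctly on the
# affected versions would need testing.)
cluster=cephdr
conf="/etc/ceph/${cluster}.conf"
echo "rbd-nbd --conf ${conf} map test-pool/testvol"
```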
Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks
On Tue, Apr 10, 2018 at 11:14:58PM -0400, Alex Gorbachev wrote:

> So Josef fixed the one issue that enables e.g. lsblk and sysfs size to reflect the correct size on change. However, partprobe and parted still do not detect the change; a complete unmap and remap of the rbd-nbd device and a remount of the filesystem are required.

Does your rbd-nbd include this fix [1], targeted for v12.2.3?

[1] http://tracker.ceph.com/issues/22172

-- Mykola Golub
Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks
On Sat, Mar 10, 2018 at 08:25:15PM -0500, Alex Gorbachev wrote:

> I am running into the problem described in https://lkml.org/lkml/2018/2/19/565 and https://tracker.ceph.com/issues/23137
>
> I went ahead and built a custom kernel reverting the change https://github.com/torvalds/linux/commit/639812a1ed9bf49ae2c026086fbf975339cd1eef
>
> After that a resize shows in lsblk and /sys/block/nbdX/size, but not in parted for a mounted filesystem.
>
> Unmapping and remapping the NBD device shows the size in parted.

Note 639812a is only a part of the changes. The more invasive changes are in 29eaadc [1]. To me the most suspicious part is the removal of bd_set_size() in nbd_size_update(), but this is just a wild guess. I would recommend contacting the authors of the change. This would also be a gentle reminder for Josef that he promised to fix this.

[1] https://github.com/torvalds/linux/commit/29eaadc0364943b6352e8994158febcb699c9f9b

-- Mykola Golub
Re: [ceph-users] Missing clones
On Mon, Feb 19, 2018 at 10:17:55PM +0100, Karsten Becker wrote:

> BTW - how can I find out which RBDs are affected by this problem? Maybe a copy/remove of the affected RBDs could help? But how to find out which RBDs this PG belongs to?

In this case rbd_data.966489238e1f29.250b looks like the problem object. To find out which RBD image it belongs to, you can run the `rbd info <pool>/<image>` command for every image in the pool, looking at the block_name_prefix field, until you find 'rbd_data.966489238e1f29'.

> Best
> Karsten
>
> On 19.02.2018 19:26, Karsten Becker wrote:
> > Hi.
> >
> > Thank you for the tip. I just tried... but unfortunately the import aborts:
> >
> >> Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.1d82:head#
> >> snapset 0=[]:{}
> >> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:18#
> >> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:24#
> >> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:head#
> >> snapset 628=[24,21,17]:{18=[17],24=[24,21]}
> >> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: In function 'void SnapMapper::add_oid(const hobject_t&, const std::set&, MapCacher::Transaction<std::__cxx11::basic_string, ceph::buffer::list>*)' thread 7facba7de400 time 2018-02-19 19:24:18.917515
> >> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: 246: FAILED assert(r == -2)
> >> ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
> >> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7facb0c2a8f2]
> >> 2: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, std::less, std::allocator > const&, MapCacher::Transaction<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
> >> 3: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x55eef35f901b]
> >> 4: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x55eef35f9ae8]
> >> 5: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, ObjectStore::Sequencer&)+0x1135) [0x55eef36002f5]
> >> 6: (main()+0x3909) [0x55eef3561349]
> >> 7: (__libc_start_main()+0xf1) [0x7facae0892b1]
> >> 8: (_start()+0x2a) [0x55eef35e901a]
> >> NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
> >> *** Caught signal (Aborted) **
> >> in thread 7facba7de400 thread_name:ceph-objectstor
> >> ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
> >> 1: (()+0x913f14) [0x55eef3c10f14]
> >> 2: (()+0x110c0) [0x7facaf5020c0]
> >> 3: (gsignal()+0xcf) [0x7facae09bfcf]
> >> 4: (abort()+0x16a) [0x7facae09d3fa]
> >> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x7facb0c2aa7e]
> >> 6: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, std::less, std::allocator > const&, MapCacher::Transaction<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
> >> 7: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x55eef35f901b]
> >> 8: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x55eef35f9ae8]
> >> 9: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, ObjectStore::Sequencer&)+0x1135) [0x55eef36002f5]
> >> 10: (main()+0x3909) [0x55eef3561349]
> >> 11: (__libc_start_main()+0xf1) [0x7facae0892b1]
> >> 12: (_start()+0x2a) [0x55eef35e901a]
> >> Aborted

-- Mykola Golub
Re: [ceph-users] rbd-fuse performance
On Tue, Jun 27, 2017 at 07:17:22PM -0400, Daniel K wrote:

> rbd-nbd isn't good as it stops at 16 block devices (/dev/nbd0-15)

modprobe nbd nbds_max=1024

Or, if the nbd module is loaded by rbd-nbd, use the --nbds_max command line option.

-- Mykola Golub
Re: [ceph-users] cannot open /dev/xvdb: Input/output error
On Mon, Jun 26, 2017 at 07:12:31PM +0200, Massimiliano Cuttini wrote:

> > In your case (rbd-nbd) this error is harmless. You can avoid them by setting in the ceph.conf [client] section something like below:
> >
> >   admin socket = /var/run/ceph/$name.$pid.asok
> >
> > Also, to make every rbd-nbd process log to a separate file, you can set (in the [client] section):
> >
> >   log file = /var/log/ceph/$name.$pid.log
>
> I need to create all the users in the ceph cluster before using this. At the moment the whole cluster was running with the ceph admin keyring. However, this is not an issue, I can rapidly deploy all users needed.

I don't understand this. I think just adding these parameters to ceph.conf should work.

> >> root 12610 0.0 0.2 1836768 11412 ? Sl Jun23 0:43 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-602b05be-395d-442e-bd68-7742deaf97bd --name client.admin
> >> root 17298 0.0 0.2 1644244 8420 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-3e16395d-7dad-4680-a7ad-7f398da7fd9e --name client.admin
> >> root 18116 0.0 0.2 1570512 8428 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-41a76fe7-c9ff-4082-adb4-43f3120a9106 --name client.admin
> >> root 19063 0.1 1.3 2368252 54944 ? Sl 21:15 0:10 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-6da2154e-06fd-4063-8af5-ae86ae61df50 --name client.admin
> >> root 21007 0.0 0.2 1570512 8644 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-c8aca7bd-1e37-4af4-b642-f267602e210f --name client.admin
> >> root 21226 0.0 0.2 1703640 8744 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-cf2139ac-b1c4-404d-87da-db8f992a3e72 --name client.admin
> >> root 21615 0.5 1.4 2368252 60256 ? Sl 21:15 0:33 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-acb2a9b0-e98d-474e-aa42-ed4e5534ddbe --name client.admin
> >> root 21653 0.0 0.2 1703640 11100 ? Sl 04:12 0:14 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-8631ab86-c85c-407b-9e15-bd86e830ba74 --name client.admin
>
> > Do you observe the issue for all these volumes? I see many of them were started recently (21:15) while others are older.
>
> Only some of them. But it's random. Some old and some just plugged become unavailable to xen.

Do you mean by "unavailable" that the image is corrupted, or does it report IO errors? If it is the first case then it was corrupted some time ago and we would need logs for that period to understand what happened.

> > Don't you observe sporadic crashes/restarts of rbd-nbd processes? You can associate an nbd device with its rbd-nbd process (and rbd volume) by looking at /sys/block/nbd*/pid and ps output.
>
> I really don't know where to look for the rbd-nbd log. Can you point it out?

According to some of your previous messages rbd-nbd is writing to /var/log/ceph/client.log:

> Under /var/log/ceph/client.log I see this error:
>
> 2017-06-25 05:25:32.833202 7f658ff04e00 0 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185), process rbd-nbd, pid 8524

You could look for errors in older log files if they are rotated.

-- Mykola Golub
Re: [ceph-users] cannot open /dev/xvdb: Input/output error
On Sun, Jun 25, 2017 at 11:28:37PM +0200, Massimiliano Cuttini wrote:

> On 25/06/2017 21:52, Mykola Golub wrote:
> > On Sun, Jun 25, 2017 at 06:58:37PM +0200, Massimiliano Cuttini wrote:
> >> I can see the error even if I easily run list-mapped:
> >>
> >> # rbd-nbd list-mapped
> >> /dev/nbd0
> >> 2017-06-25 18:49:11.761962 7fcdd9796e00 -1 asok(0x7fcde3f72810) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.asok': (17) File exists
> >> /dev/nbd1
> > "AdminSocket::bind_and_listen: failed to bind" errors are harmless, you can safely ignore them (or configure admin_socket in ceph.conf to avoid name collisions).
>
> I read around that this can lead to a lock in the opening. http://tracker.ceph.com/issues/7690
> If the daemon exists then you have to wait until it ends its operation before you can connect.

In your case (rbd-nbd) this error is harmless. You can avoid the errors by setting in the ceph.conf [client] section something like below:

  admin socket = /var/run/ceph/$name.$pid.asok

Also, to make every rbd-nbd process log to a separate file, you can set (in the [client] section):

  log file = /var/log/ceph/$name.$pid.log

> root 12610 0.0 0.2 1836768 11412 ? Sl Jun23 0:43 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-602b05be-395d-442e-bd68-7742deaf97bd --name client.admin
> root 17298 0.0 0.2 1644244 8420 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-3e16395d-7dad-4680-a7ad-7f398da7fd9e --name client.admin
> root 18116 0.0 0.2 1570512 8428 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-41a76fe7-c9ff-4082-adb4-43f3120a9106 --name client.admin
> root 19063 0.1 1.3 2368252 54944 ? Sl 21:15 0:10 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-6da2154e-06fd-4063-8af5-ae86ae61df50 --name client.admin
> root 21007 0.0 0.2 1570512 8644 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-c8aca7bd-1e37-4af4-b642-f267602e210f --name client.admin
> root 21226 0.0 0.2 1703640 8744 ? Sl 21:15 0:01 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-cf2139ac-b1c4-404d-87da-db8f992a3e72 --name client.admin
> root 21615 0.5 1.4 2368252 60256 ? Sl 21:15 0:33 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-acb2a9b0-e98d-474e-aa42-ed4e5534ddbe --name client.admin
> root 21653 0.0 0.2 1703640 11100 ? Sl 04:12 0:14 rbd-nbd --nbds_max 64 map RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-8631ab86-c85c-407b-9e15-bd86e830ba74 --name client.admin

Do you observe the issue for all these volumes? I see many of them were started recently (21:15) while others are older.

Don't you observe sporadic crashes/restarts of rbd-nbd processes? You can associate an nbd device with its rbd-nbd process (and rbd volume) by looking at /sys/block/nbd*/pid and ps output.

-- Mykola Golub
Re: [ceph-users] cannot open /dev/xvdb: Input/output error
On Sun, Jun 25, 2017 at 06:58:37PM +0200, Massimiliano Cuttini wrote:

> I can see the error even if I easily run list-mapped:
>
> # rbd-nbd list-mapped
> /dev/nbd0
> 2017-06-25 18:49:11.761962 7fcdd9796e00 -1 asok(0x7fcde3f72810) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.asok': (17) File exists
> /dev/nbd1

"AdminSocket::bind_and_listen: failed to bind" errors are harmless, you can safely ignore them (or configure admin_socket in ceph.conf to avoid name collisions). Don't you see other errors? What is the output of `ps auxww | grep rbd-nbd`?

As a first step you could try to export the images to files using `rbd export`, see if it succeeds, and probably investigate the content.

-- Mykola Golub
Re: [ceph-users] Help needed rbd feature enable
e 'VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4':
> >>> size 102400 MB in 51200 objects
> >>> order 21 (2048 kB objects)
> >>> block_name_prefix: rbd_data.5fde2ae8944a
> >>> format: 2
> >>> features:
> >>> flags:
> >>>
> >>> trying to enable them will get this error:
> >>>
> >>> rbd: failed to update image features: (22) Invalid argument
> >>> 2017-06-23 21:20:03.748746 7fdec1b34d80 -1 librbd: cannot update immutable features
> >>>
> >>> I read in the guide I should have placed rbd_default_features in the config.
> >>>
> >>> What can I do now to enable all features of jewel on all images?
> >>> Can I insert all the features of jewel, or is there any issue with old kernels?
> >>>
> >>> Thanks,
> >>> Max

-- Mykola Golub
Re: [ceph-users] removing 'rados cppool' command
On Fri, May 06, 2016 at 03:41:34PM -0400, Sage Weil wrote:

> This PR
>
> https://github.com/ceph/ceph/pull/8975
>
> removes the 'rados cppool' command. The main problem is that the command does not make a faithful copy of all data because it doesn't preserve the snapshots (and snapshot related metadata). That means if you copy an RBD pool it will render the images somewhat broken (snaps won't be present and won't work properly). It also doesn't preserve the user_version field that some librados users may rely on.
>
> Since it's obscure and of limited use, this PR just removes it.

Copying a pool is sometimes useful, even with those limitations. Until there is an alternative way to do the same, I would not remove this. A better approach to me is to move this functionality to something like ceph_radostool (people use such tools only when facing extraordinary situations, so they are more careful and expect limitations).

-- Mykola Golub
Re: [ceph-users] RBD image mounted by command "rbd-nbd" the status is read-only.
On Mon, Apr 25, 2016 at 08:09:54PM +0200, Ilya Dryomov wrote:
> On Mon, Apr 25, 2016 at 7:47 PM, Stefan Lissmats <ste...@trimmat.se> wrote:
> > Hello again!
> >
> > I understand that it's not recommended running osd and rbd-nbd on the same host, and I actually moved my rbd-nbd to a completely clean host (same kernel and OS though), but with the same result.
> >
> > I hope someone can resolve this. You seem to indicate it is some kind of known error, but I didn't really understand the github commit that you linked.
>
> Yes, it is a bug. rbd-nbd code expects writes to have rval (return code) equal to the size of the write. I'm pretty sure that's wrong, because rval for writes should be 0 or a negative error.
>
> I think what happens is your writes complete successfully, but rbd-nbd then throws an -EIO to the kernel because 0 != write size. I could be wrong, so let's wait for Mykola to chime in - he added that check to fix discards.

Sorry for the delay (I missed this thread due to a wrong filter). I don't recall the details, but I was under the impression that on success the aio_write completion returned the number of bytes written. I might have been confused by this test that checks for r >= 0:

https://github.com/ceph/ceph/blob/master/src/test/librbd/test_librbd.cc#L1254

Now, looking at it again, that is certainly not true and my patch is wrong. I see the fix has already been requested:

https://github.com/ceph/ceph/pull/8775/

Thanks.

-- Mykola Golub
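The bug being discussed can be sketched in a few lines. This is a schematic illustration of the completion check, not the actual rbd-nbd source: write completions report 0 on success (or a negative errno), so a check comparing rval against the write length turns every successful write into -EIO.

```python
# Schematic of the rbd-nbd completion check discussed above (not the
# actual C++ code). aio_write completions return 0 on success or a
# negative errno -- not the number of bytes written.
import errno

def complete_write_buggy(rval, length):
    """The faulty check: treats a successful write (rval == 0) as an
    error because 0 != length."""
    if rval != length:
        return -errno.EIO
    return 0

def complete_write_fixed(rval, length):
    """The corrected check: only a negative rval means failure."""
    if rval < 0:
        return rval
    return 0

# A successful 4096-byte write completes with rval == 0:
print(complete_write_buggy(0, 4096))  # buggy path reports an error
print(complete_write_fixed(0, 4096))  # fixed path reports success
```

The check was originally added for discards, where the same "rval equals byte count" assumption also does not hold.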
Re: [ceph-users] incomplete pg, recovery some data
On Thu, Jun 18, 2015 at 01:24:38PM +0200, Mateusz Skała wrote:
> Hi,
> After some hardware errors, one of the PGs on our backup server is 'incomplete'. I exported the pg without problems, as described here: https://ceph.com/community/incomplete-pgs-oh-my/
> After removing the pg from all OSDs and importing it into one OSD, the pg is still 'incomplete'. I want to recover only some piece of data from this rbd, so if I lose something, nothing bad happens. How can I tell ceph to accept this pg as complete and clean?

I have a patch for ceph-objectstore-tool which adds a mark-complete operation, as suggested by Sam in http://tracker.ceph.com/issues/10098

https://github.com/ceph/ceph/pull/5031

It has not been reviewed yet and is not well tested, because I don't know a simple way to get an incomplete pg. You might want to try it at your own risk.

-- Mykola Golub
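For readers landing here later: assuming the patch above (or a release that has since merged the mark-complete op), the invocation would look something like the sketch below. The data path, journal path, and pgid are placeholders; the OSD must be stopped first, and this can only run against a live cluster, so treat it as an operational sketch rather than a tested recipe.

```shell
# Sketch only -- flags depend on the patch/release; paths and pgid
# are placeholders. Stop the OSD that holds the pg first.
service ceph stop osd.0

ceph-objectstore-tool \
    --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal \
    --pgid 1.28 \
    --op mark-complete
```

As the message above warns, marking a pg complete tells the cluster to accept whatever data is present, so it should only follow an export/import recovery attempt.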
Re: [ceph-users] ceph-osd - No Longer Creates osd.X upon Launch - Bug ?
On Sun, Feb 15, 2015 at 5:39 PM, Sage Weil <s...@newdream.net> wrote:
> On Sun, 15 Feb 2015, Mykola Golub wrote:
> > The `ceph osd create` command could be extended to take the OSD ID as a second optional argument (the first is already used for the uuid):
> >
> >   ceph osd create <uuid> <id>
> >
> > The command would succeed only if the ID were not in use. Ron, would this work for you?
> >
> > I have a patch as a proof of concept:
> > https://github.com/trociny/ceph/compare/wip-osd_create
>
> This looks reasonable to me! Do you mind adding a few test cases in qa/workunits/cephtool/test.sh to go along with it?

https://github.com/ceph/ceph/pull/3743/commits

-- Mykola Golub
Re: [ceph-users] ceph-osd - No Longer Creates osd.X upon Launch - Bug ?
On Thu, Feb 05, 2015 at 08:33:39AM -0700, Ron Allred wrote:
> Hello,
>
> The latest ceph-osd in Firefly v0.80.8 no longer auto-creates its osd.X entry in the osd map, which it was assigned via ceph.conf.
>
> I am very aware the documentation states "ceph osd create" can do this job, but this command only assigns the next sequential osd.X number. This is highly undesirable. For _years_ we have assigned number ranges to each OSD server for an organized multi-tier (SSD / SAS / SATA) crush map (leaving gaps in osd numbering, naturally), skipping 'ceph osd create' entirely.
>
> We are now facing a problem: an OSD remove+replace now can't use its former osd.X ID, making a huge mess of documentation, number patterning, and disk labeling.
>
> Is there a work-around to forcefully create an osd.X number??

The `ceph osd create` command could be extended to take the OSD ID as a second optional argument (the first is already used for the uuid):

  ceph osd create <uuid> <id>

The command would succeed only if the ID were not in use. Ron, would this work for you?

I have a patch as a proof of concept:

https://github.com/trociny/ceph/compare/wip-osd_create

-- Mykola Golub
Re: [ceph-users] ceph-osd - No Longer Creates osd.X upon Launch - Bug ?
On Sun, Feb 15, 2015 at 06:24:45PM -0800, Sage Weil wrote:

On Sun, 15 Feb 2015, Gregory Farnum wrote:

On Sun, Feb 15, 2015 at 5:39 PM, Sage Weil <s...@newdream.net> wrote:

On Sun, 15 Feb 2015, Mykola Golub wrote:
https://github.com/trociny/ceph/compare/wip-osd_create

This looks reasonable to me! Do you mind adding a few test cases in qa/workunits/cephtool/test.sh to go along with it?

Will do. Thanks.

Usual disclaimer: we discourage getting creative with the osd ids because they are allocated as an *array* in memory, so skipping entries consumes some extra memory. This can become significant if there are large gaps and/or clusters are large.

These options used to exist and were removed quite deliberately. I don't remember the entire conversation at this point, but we'll need to find and address the concerns raised then before reintroducing the ability to explicitly set OSD IDs. IIRC I was on the losing end of this, because it's definitely behavior we should be offering to admins, but the issues were significant enough that we had to eliminate the option. Methods of preserving the user-facing utility, like adding OSD names, were deemed too difficult to implement. :( (I think it largely had to do with serious issues over the availability and location of data when OSDs disappear but new ones with the same ID are present. And what you do when somebody then resurrects the original OSDs. But there might have been other things too.)

The part I remember was just that 'ceph osd create <id>' wasn't a safe and idempotent command. I don't think reusing ids is the problem, though if it is, then it is still a problem, since osd create will re-use the first available id. I think all this option lets us do that we didn't before is leave gaps in the id space?

I think so too -- leaving gaps should be the only difference compared to what we already have. I was wondering, though, whether I might need to do something with those nonexistent IDs in the gap, e.g. pre-allocating them explicitly with some flags combination. It looks like I don't...

-- Mykola Golub
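The memory disclaimer above can be illustrated with a toy model. This is not the osdmap code -- just a Python sketch of the two allocation behaviors discussed: the default "take the first free id" and the proposed "take a specific unused id", where the latter grows the id array across the gap.

```python
# Toy model (not the osdmap code) of OSD id allocation: ids are slots
# in an array, 'create' normally takes the first free slot, and the
# proposed extension lets an admin request a specific unused id --
# at the cost of array gaps that still consume slots.

def osd_create(osds, want_id=None):
    """osds is a list where index == osd id; None marks a free slot.
    Returns the allocated id, or raises if want_id is already taken."""
    if want_id is None:
        # Default behaviour: re-use the first available id.
        for i, slot in enumerate(osds):
            if slot is None:
                osds[i] = "osd.%d" % i
                return i
        osds.append("osd.%d" % len(osds))
        return len(osds) - 1
    if want_id < len(osds) and osds[want_id] is not None:
        raise ValueError("id %d already in use" % want_id)
    # Grow the array -- this is the memory cost of leaving gaps.
    osds.extend([None] * (want_id + 1 - len(osds)))
    osds[want_id] = "osd.%d" % want_id
    return want_id

osds = []
print(osd_create(osds))       # 0
print(osd_create(osds, 100))  # 100, leaving slots 1..99 as gaps
print(len(osds))              # 101 -- the array really grew
```

Note that after the gap is created, a plain create still fills slot 1 next, which matches Sage's point that id re-use already happens either way.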
Re: [ceph-users] RBD caching on 4K reads???
On Fri, Jan 30, 2015 at 10:09:32PM +0100, Udo Lembke wrote:
> Hi Bruce,
> you can also look on the mon, like
> ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep cache

rbd cache is a client setting, so you have to check it by connecting to the client admin socket. Its location is defined in ceph.conf, in the [client] section, via the 'admin socket' parameter.

-- Mykola Golub
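Concretely, that configuration and query would look something like the sketch below. The socket path is only an example ($cluster, $type, $id, and $pid are ceph.conf metavariables); adjust it to whatever your ceph.conf actually sets.

```
# ceph.conf on the client host -- the socket path is an example.
[client]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.asok
    rbd cache = true

# Then, on the client host while the client (e.g. qemu) is running,
# query the live value through that socket:
#
#   ceph --admin-daemon /var/run/ceph/ceph-client.admin.<pid>.asok \
#       config show | grep rbd_cache
```

The key point from the message above: querying a mon's admin socket shows the mon's settings, not the client's, so the socket must belong to the librbd client process itself.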
Re: [ceph-users] osd tree to show primary-affinity value
On Thu, Dec 25, 2014 at 03:57:15PM +1100, Dmitry Smirnov wrote:
> Please don't withhold this improvement -- go ahead and submit a pull request to let the developers decide whether they want this or not. IMHO it is a very useful improvement. Thank you very much for implementing it.

Done.

https://github.com/ceph/ceph/pull/3254

-- Mykola Golub
[ceph-users] osd tree to show primary-affinity value
Hi,

I stumbled upon this feature request from Dmitry, asking to make osd tree show the primary-affinity value:

http://tracker.ceph.com/issues/10036

This looks useful in some cases and is simple to implement, so here is the patch:

https://github.com/trociny/ceph/compare/feature-10036

But before sending a pull request, I'd like to hear other people's opinions. I wonder if it would be useful for most users, or whether they would rather consider it white noise in the osd tree output? Note, currently primary-affinity can be obtained in other ways, e.g.:

ceph -f json-pretty osd dump

Here is an example of osd tree output after the change:

% ceph osd tree
# id  weight  type name        up/down  reweight  primary-affinity
-1    3       root default
-2    3           host zhuzha
0     1               osd.0    up       1         0.5
1     1               osd.1    up       1         0.75
2     1               osd.2    up       1         1

% ceph -f json-pretty osd tree
{ "nodes": [
        { "id": -1, "name": "default", "type": "root", "type_id": 10, "children": [-2]},
        { "id": -2, "name": "zhuzha", "type": "host", "type_id": 1, "children": [2, 1, 0]},
        { "id": 0, "name": "osd.0", "exists": 1, "type": "osd", "type_id": 0, "status": "up", "reweight": 1.00, "primary_affinity": 0.50, "crush_weight": 1.00, "depth": 2},
        { "id": 1, "name": "osd.1", "exists": 1, "type": "osd", "type_id": 0, "status": "up", "reweight": 1.00, "primary_affinity": 0.75, "crush_weight": 1.00, "depth": 2},
        { "id": 2, "name": "osd.2", "exists": 1, "type": "osd", "type_id": 0, "status": "up", "reweight": 1.00, "primary_affinity": 1.00, "crush_weight": 1.00, "depth": 2}],
  "stray": []}

-- Mykola Golub
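One nice property of the JSON form is that it is trivial to post-process. A small Python sketch that pulls per-OSD primary-affinity out of `ceph -f json-pretty osd tree` output (the sample below is the output quoted above, abbreviated to the fields used):

```python
# Extract per-OSD primary-affinity from 'ceph -f json-pretty osd tree'
# output. The sample is abbreviated from the output shown above.
import json

sample = '''
{ "nodes": [
    { "id": 0, "name": "osd.0", "type": "osd", "status": "up",
      "reweight": 1.00, "primary_affinity": 0.50},
    { "id": 1, "name": "osd.1", "type": "osd", "status": "up",
      "reweight": 1.00, "primary_affinity": 0.75},
    { "id": 2, "name": "osd.2", "type": "osd", "status": "up",
      "reweight": 1.00, "primary_affinity": 1.00}],
  "stray": []}
'''

tree = json.loads(sample)
# Keep only the osd nodes; root/host buckets carry no primary-affinity.
affinity = {n["name"]: float(n["primary_affinity"])
            for n in tree["nodes"] if n["type"] == "osd"}
print(affinity)  # {'osd.0': 0.5, 'osd.1': 0.75, 'osd.2': 1.0}
```

In a real script you would feed it `subprocess` output from the ceph CLI instead of the inline sample.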