Re: [ceph-users] Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)

2019-07-29 Thread Mykola Golub
On Sat, Jul 27, 2019 at 06:08:58PM +0530, Ajitha Robert wrote:

> *1) Will there be any folder related to rbd-mirroring in /var/lib/ceph ? *

no

> *2) Is ceph rbd-mirror authentication mandatory?*

no. But why are you asking?

> *3)when even i create any cinder volume loaded with glance image i get the
> following error.. *
> 
> 2019-07-27 17:26:46.762571 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019-07-27 17:27:07.541701 7f939d7fa700  0 rbd::mirror::ImageReplayer:
> 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down:
> remote image no longer exists: scheduling deletion
> 2019-07-27 17:27:16.766199 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019-07-27 17:27:22.568970 7f939d7fa700  0 rbd::mirror::ImageReplayer:
> 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down:
> mirror image no longer exists
> 2019-07-27 17:27:46.769158 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019

The log tells that the primary image was deleted for some reason, and
rbd-mirror scheduled deletion of the secondary (mirrored) image. The
logs do not show why the primary image was deleted. It might be Cinder,
but I can't exclude some bug in the rbd-mirror running on the
primary cluster, though I don't recall any issues like this.

> *At times I am able to create a bootable Cinder volume despite the above
> errors, but at certain times I face the following.*
> 
> For example, for a 50 GB volume the local image gets created, but it
> couldn't create a mirror image.

"Connection timed out" errors suggest you have a connectivity issue
between sites?

-- 
Mykola Golub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)

2019-07-26 Thread Mykola Golub
On Fri, Jul 26, 2019 at 04:40:35PM +0530, Ajitha Robert wrote:
> Thank you for the clarification.
> 
> But i was trying with openstack-cinder.. when i load some data into the
> volume around 50gb, the image sync will stop by 5 % or something within
> 15%...  What could be the reason?

I suppose you see the image sync stop in the mirror status output?
Could you please provide an example? And I suppose you don't see any
other messages in the rbd-mirror log apart from what you have already
posted? Depending on configuration, rbd-mirror may write to several
log files. Could you please try to find all of them? `lsof | grep
'rbd-mirror.*log'` may be useful for this.

BTW, what rbd-mirror version are you running?

-- 
Mykola Golub


Re: [ceph-users] Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)

2019-07-26 Thread Mykola Golub
On Fri, Jul 26, 2019 at 12:31:59PM +0530, Ajitha Robert wrote:
>  I have an rbd mirroring setup with primary and secondary clusters as peers,
> and I have a pool enabled in image mode. In this pool I created an rbd image
> with journaling enabled.
> But whenever I enable mirroring on the image, I get errors in
> rbd-mirror.log and osd.log.
> I have increased the timeouts; nothing worked and I couldn't trace the
> error.
> Please guide me to solve this error.
> 
> *Logs*
> http://paste.openstack.org/show/754766/

What do you mean by "nothing worked"? According to the mirroring status
the image is mirroring: it is in "up+stopped" state on the primary, as
expected, and in "up+replaying" state on the secondary with 0 entries
behind master.

The "failed to get omap key" error in the osd log is harmless, and a
fix was merged upstream just a week ago so it is no longer displayed.

The cause of the "InstanceWatcher: ... resending after timeout" error
in the rbd-mirror log is not clear, but if it is not repeating it is
harmless too.

I see you were trying to map the image with krbd. That is expected to
fail, as krbd does not support the "journaling" feature, which is
necessary for mirroring. You can access those images only with librbd
(e.g. mapping with the rbd-nbd driver, or via qemu).
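
For example, a journaling-enabled image could be attached via rbd-nbd instead of krbd (pool/image names below are placeholders):

```sh
# Map through librbd-backed NBD instead of the kernel client:
rbd-nbd map mypool/myimage      # prints the device name, e.g. /dev/nbd0
mount /dev/nbd0 /mnt

# ... use the filesystem ...

umount /mnt
rbd-nbd unmap /dev/nbd0
```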

-- 
Mykola Golub


Re: [ceph-users] RBD image format v1 EOL ...

2019-02-22 Thread Mykola Golub
On Fri, Feb 22, 2019 at 02:43:36PM +0200, koukou73gr wrote:
> On 2019-02-20 17:38, Mykola Golub wrote:
> 
> > Note, even if rbd supported live (without any downtime) migration, you
> > would still need to restart the client after the upgrade to a new
> > librbd with migration support.
> > 
> > 
> You could probably get away with running the client with a new librbd
> version by live migrating the VM to an updated hypervisor.
> 
> At least, this is what I have been doing so far when updating Ceph client
> libraries with zero downtime.

Yes, and this is what I meant when I wrote about investigating the
possibility of migrating to the new format with zero downtime by using
VM live migration. If there were a way to execute "rbd migration
prepare $pool/$image" during live VM migration, exactly after the
source VM closes the rbd image but before the destination VM opens it,
it would do the trick (the destination VM would start using the new
format).

Right now I don't know if it is possible at all, because I am not
familiar with VM migration internals, e.g. I don't know whether the
image is opened on the destination before or after it is closed on the
source.

-- 
Mykola Golub


Re: [ceph-users] RBD image format v1 EOL ...

2019-02-20 Thread Mykola Golub
On Wed, Feb 20, 2019 at 10:22:47AM +0100, Jan Kasprzak wrote:

>   If I read the parallel thread about pool migration in ceph-users@
> correctly, the ability to migrate to v2 would still require stopping the
> client before "rbd migration prepare" can be executed.

Note, even if rbd supported live (without any downtime) migration, you
would still need to restart the client after the upgrade to a new
librbd with migration support.

So actually you can combine the upgrade with migration:

  upgrade client library
  stop client
  rbd migration prepare
  start client

and eventually:

  rbd migration execute
  rbd migration commit

And it would be interesting to investigate the possibility of replacing
the "stop/start client" steps with "migrate the VM to another (upgraded)
host", to avoid stopping the VM at all. The trick would be to somehow
execute "rbd migration prepare" after the source VM closes the
image, but before the destination VM opens it.
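
The idea, sketched as commands (hedged: this assumes the `rbd migration` subcommands discussed in this thread, and a hypervisor that gives you a hook at exactly the right point of the live migration):

```sh
# Exactly after the source VM closes the image, before the destination
# VM opens it (whether this ordering is achievable is the open question):
rbd migration prepare mypool/myimage

# Later, in the background, and once copying is complete:
rbd migration execute mypool/myimage
rbd migration commit mypool/myimage
```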

-- 
Mykola Golub


Re: [ceph-users] read-only mounts of RBD images on multiple nodes for parallel reads

2019-01-23 Thread Mykola Golub
On Tue, Jan 22, 2019 at 01:26:29PM -0800, Void Star Nill wrote:

> Regarding Mykola's suggestion to use Read-Only snapshots, what is the
> overhead of creating these snapshots? I assume these are copy-on-write
> snapshots, so there's no extra space consumed except for the metadata?

Yes.

-- 
Mykola Golub


Re: [ceph-users] slow requests and high i/o / read rate on bluestore osds after upgrade 12.2.8 -> 12.2.10

2019-01-19 Thread Mykola Golub
On Fri, Jan 18, 2019 at 11:06:54AM -0600, Mark Nelson wrote:

> IE even though you guys set bluestore_cache_size to 1GB, it is being
> overridden by bluestore_cache_size_ssd.

Isn't it vice versa [1]?

[1] 
https://github.com/ceph/ceph/blob/luminous/src/os/bluestore/BlueStore.cc#L3976

-- 
Mykola Golub


Re: [ceph-users] read-only mounts of RBD images on multiple nodes for parallel reads

2019-01-18 Thread Mykola Golub
On Thu, Jan 17, 2019 at 10:27:20AM -0800, Void Star Nill wrote:
> Hi,
> 
> We are trying to use Ceph in our products to address some of our use cases.
> We think the Ceph block device is a good fit for us. One of the use cases is
> that we have a number of jobs running in containers that need Read-Only
> access to shared data. The data is written once and is consumed multiple
> times. I have read through some of the similar discussions and the
> recommendations on using CephFS for these situations, but in our case the
> block device makes more sense, as it fits well with other use cases and
> restrictions we have around this use case.
> 
> The following scenario seems to work as expected when we tried on a test
> cluster, but we wanted to get an expert opinion to see if there would be
> any issues in production. The usage scenario is as follows:
> 
> - A block device is created with "--image-shared" options:
> 
> rbd create mypool/foo --size 4G --image-shared

"--image-shared" just means that the created image will have the
"exclusive-lock" feature, and all other features that depend on it,
disabled. It is useful for scenarios where one wants simultaneous write
access to the image (e.g. when using a shared-disk cluster fs like
ocfs2) and does not want the performance penalty of the "exclusive-lock"
being ping-ponged between writers.

For your scenario it is not necessary, but it is ok.
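
For reference, a minimal sketch contrasting the two creation modes (pool/image names are placeholders):

```sh
# Default: exclusive-lock and its dependent features enabled.
rbd create mypool/foo --size 4G

# Shared: exclusive-lock and everything depending on it disabled.
rbd create mypool/bar --size 4G --image-shared

# Compare the resulting feature sets:
rbd info mypool/foo | grep features
rbd info mypool/bar | grep features
```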

> - The image is mapped to a host, formatted as ext4 (or another filesystem),
> mounted to a directory in read/write mode, and data is written to it.
> Please note that the image will be mapped in exclusive write mode -- no
> other read/write mounts are allowed at this time.

The map "exclusive" option works only for images with the
"exclusive-lock" feature enabled, and in that case prevents automatic
exclusive lock transitions (the ping-pong mentioned above) from one
writer to another. It will not prevent mapping and mounting the image
ro, and probably not even rw (I am not familiar enough with the kernel
rbd implementation to be sure here), though in the latter case the
writes would fail.

> - The volume is unmapped from the host and then mapped on to N number of
> other hosts where it will be mounted in read-only mode and the data is read
> simultaneously from N readers
> 
> As mentioned above, this seems to work as expected, but we wanted to
> confirm that we won't run into any unexpected issues.

It should work, although as you can see rbd provides little protection
against simultaneous access in this case, so it should be carefully
organized at a higher level. You may also consider creating a snapshot
after modifying the image, and mapping and mounting the snapshot on the
readers. This way you can even modify the image without unmounting the
readers, and then remap/remount the new snapshot. And you get a
rollback option for free.

Also, there is a valid concern mentioned by others that ext4 might want
to flush the journal if it is not clean, even when mounting ro. I
expect the mount would just fail in this case because the image is
mapped ro, but you might want to investigate how to handle this.
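
A rough sketch of the snapshot-based workflow, with `noload` shown as one ext4 option to evaluate for the read-only journal concern (names are placeholders; a sketch, not a tested recipe):

```sh
# Writer: modify the image, then publish a snapshot for the readers.
rbd snap create mypool/foo@v1

# Each reader: map the snapshot (snapshots are mapped read-only).
rbd map mypool/foo@v1               # e.g. /dev/rbd0
mount -o ro,noload /dev/rbd0 /mnt   # noload: skip ext4 journal recovery

# To publish new data later: snap @v2, then remap/remount on the readers.
```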

-- 
Mykola Golub


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Mykola Golub
On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:

> I reconfigure the osd service from start, the journal was:

I am not quite sure I understand what you mean here.

> --
> 
> -- Unit ceph-osd@0.service has finished starting up.
> -- 
> -- The start-up result is RESULT.
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80
> -1 Public network was set, but cluster network was not set
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80
> -1 Using public network also for cluster network
> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365 7f6a27204e80
> -1 journal FileJournal::_open: disabling aio for non-block journal.  Use
> journal_force_aio to force use of a>
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414 7f6a27204e80
> -1 journal do_read_entry(6930432): bad header magic
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729 7f6a27204e80
> -1 osd.0 21 log_to_monitors {default=true}
> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13 for check of
> host 'localhost' was out of bounds.
> 
> --

Could you please post the full ceph-osd log somewhere? 
/var/log/ceph/ceph-osd.0.log

> but hang at the command: "rbd create libvirt-pool/dimage --size 10240 "

So it hangs forever now instead of returning the error?
What is the `ceph -s` output?

-- 
Mykola Golub


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-05 Thread Mykola Golub
On Mon, Nov 05, 2018 at 06:14:09PM +0800, Dengke Du wrote:

> -1 osd.0 20 class rbd open got (2) No such file or directory

So the rbd object class was not loaded. Check whether the directory
returned by this command:

  ceph-conf --name osd.0 -D | grep osd_class_dir

contains libcls_rbd.so, and whether the list returned by this
command:

  ceph-conf --name osd.0 -D | grep osd_class_load_list

contains rbd.

-- 
Mykola Golub


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-05 Thread Mykola Golub
On Mon, Nov 05, 2018 at 03:19:29PM +0800, Dengke Du wrote:
> Hi all
> 
> ceph: 13.2.2
> 
> When running this command:
> 
>     rbd create libvirt-pool/dimage --size 10240
> 
> an error happens:
> 
>     rbd: create error: 2018-11-04 23:54:56.224 7ff22e7fc700 -1
> librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error
> creating RBD id object: (95) Operation not supported
>     (95) Operation not supported

Most likely the libcls_rbd.so plugin was not loaded by the OSD for
some reason (not found?). Could you please try grepping the OSD logs
for cls errors? You will probably need to restart an OSD to get some
fresh messages at OSD start.
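
One way to search, assuming the default log location:

```sh
# Look for class-loading errors around OSD startup:
grep -i cls /var/log/ceph/ceph-osd.*.log | grep -iE 'error|no such file'
```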

-- 
Mykola Golub


Re: [ceph-users] rbd-nbd map question

2018-09-21 Thread Mykola Golub
Vikas, could you tell which version you observe this on?

Because I can reproduce this only on jewel, and it has been fixed
since luminous 12.2.1 [1].

[1] http://tracker.ceph.com/issues/20426
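
Until the parsing fix lands, one possible workaround (untested, so treat it as an assumption) is to bypass rbd-nbd's own argument parsing and pass the override through the CEPH_ARGS environment variable, which Ceph tools read at startup:

```sh
# Hypothetical workaround for the --cluster parsing bug:
CEPH_ARGS="--cluster cephdr" rbd-nbd map test-pool/testvol
```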

On Wed, Sep 19, 2018 at 03:48:44PM -0400, Jason Dillaman wrote:
> Thanks for reporting this -- it looks like we broke the part where
> command-line config overrides were parsed out from the parsing. I've
> opened a tracker ticket against the issue [1].
> 
> On Wed, Sep 19, 2018 at 2:49 PM Vikas Rana  wrote:
> >
> > Hi there,
> >
> > With default cluster name "ceph" I can map rbd-nbd without any issue.
> >
> > But for a different cluster name, i'm not able to map image using rbd-nbd 
> > and getting
> >
> > root@vtier-P-node1:/etc/ceph# rbd-nbd --cluster cephdr map test-pool/testvol
> > rbd-nbd: unknown command: --cluster
> >
> >
> > I looked at the man page and the syntax looks right.
> > Can someone please help me on what I'm doing wrong?
> >
> > Thanks,
> > -Vikas
> 
> [1] http://tracker.ceph.com/issues/36089
> 
> -- 
> Jason

-- 
Mykola Golub


Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Mykola Golub
On Tue, Apr 10, 2018 at 11:14:58PM -0400, Alex Gorbachev wrote:

> So Josef fixed the one issue that enables e.g. lsblk and sysfs size to
> reflect the correct siz on change.  However, partptobe and parted
> still do not detect the change, complete unmap and remap of rbd-nbd
> device and remount of the filesystem is required.

Does your rbd-nbd include this fix [1], targeted for v12.2.3?

[1] http://tracker.ceph.com/issues/22172

-- 
Mykola Golub


Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-03-11 Thread Mykola Golub
On Sat, Mar 10, 2018 at 08:25:15PM -0500, Alex Gorbachev wrote:
> I am running into the problem described in
> https://lkml.org/lkml/2018/2/19/565 and
> https://tracker.ceph.com/issues/23137
> 
> I went ahead and built a custom kernel reverting the change
> https://github.com/torvalds/linux/commit/639812a1ed9bf49ae2c026086fbf975339cd1eef
> 
> After that a resize shows in lsblk and /sys/block/nbdX/size, but not
> in parted for a mounted filesystem.
> 
> Unmapping and remapping the NBD device shows the size in parted.

Note that 639812a is only a part of the changes. The more invasive
changes are in 29eaadc [1]. To me the most suspicious part is the
removal of bd_set_size() in nbd_size_update(), but this is just a
wild guess.

I would recommend contacting the authors of the change. This would
also be a gentle reminder for Josef that he promised to fix this.

[1] 
https://github.com/torvalds/linux/commit/29eaadc0364943b6352e8994158febcb699c9f9b

-- 
Mykola Golub


Re: [ceph-users] Missing clones

2018-02-19 Thread Mykola Golub
On Mon, Feb 19, 2018 at 10:17:55PM +0100, Karsten Becker wrote:
> BTW - how can I find out, which RBDs are affected by this problem. Maybe
> a copy/remove of the affected RBDs could help? But how to find out to
> which RBDs this PG belongs to?

In this case rbd_data.966489238e1f29.250b looks like the
problem object. To find out which RBD image it belongs to, you can run
the `rbd info` command for every image in the pool, looking at the
block_name_prefix field, until you find 'rbd_data.966489238e1f29'.
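
The search can be scripted. A small sketch, where the awk extraction is the real logic and the commented loop assumes your pool name:

```shell
# Print the block_name_prefix value from `rbd info` output read on stdin.
extract_prefix() {
  awk '/^[[:space:]]*block_name_prefix:/ {print $2}'
}

# In real use (pool name is a placeholder):
#   for img in $(rbd ls mypool); do
#     p=$(rbd info "mypool/$img" | extract_prefix)
#     [ "$p" = rbd_data.966489238e1f29 ] && echo "affected: $img"
#   done
```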

> 
> Best
> Karsten
> 
> On 19.02.2018 19:26, Karsten Becker wrote:
> > Hi.
> > 
> > Thank you for the tip. I just tried... but unfortunately the import aborts:
> > 
> >> Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.1d82:head#
> >> snapset 0=[]:{}
> >> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:18#
> >> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:24#
> >> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:head#
> >> snapset 628=[24,21,17]:{18=[17],24=[24,21]}
> >> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: In function 'void 
> >> SnapMapper::add_oid(const hobject_t&, const std::set&, 
> >> MapCacher::Transaction<std::__cxx11::basic_string, 
> >> ceph::buffer::list>*)' thread 7facba7de400 time 2018-02-19 19:24:18.917515
> >> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: 246: FAILED 
> >> assert(r == -2)
> >>  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous 
> >> (stable)
> >>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> >> const*)+0x102) [0x7facb0c2a8f2]
> >>  2: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, 
> >> std::less, std::allocator > const&, 
> >> MapCacher::Transaction<std::__cxx11::basic_string<char, 
> >> std::char_traits, std::allocator >, 
> >> ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
> >>  3: (get_attrs(ObjectStore*, coll_t, ghobject_t, 
> >> ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, 
> >> SnapMapper&)+0xafb) [0x55eef35f901b]
> >>  4: (ObjectStoreTool::get_object(ObjectStore*, coll_t, 
> >> ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) 
> >> [0x55eef35f9ae8]
> >>  5: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, 
> >> std::__cxx11::basic_string<char, std::char_traits, 
> >> std::allocator >, ObjectStore::Sequencer&)+0x1135) [0x55eef36002f5]
> >>  6: (main()+0x3909) [0x55eef3561349]
> >>  7: (__libc_start_main()+0xf1) [0x7facae0892b1]
> >>  8: (_start()+0x2a) [0x55eef35e901a]
> >>  NOTE: a copy of the executable, or `objdump -rdS ` is needed 
> >> to interpret this.
> >> *** Caught signal (Aborted) **
> >>  in thread 7facba7de400 thread_name:ceph-objectstor
> >>  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous 
> >> (stable)
> >>  1: (()+0x913f14) [0x55eef3c10f14]
> >>  2: (()+0x110c0) [0x7facaf5020c0]
> >>  3: (gsignal()+0xcf) [0x7facae09bfcf]
> >>  4: (abort()+0x16a) [0x7facae09d3fa]
> >>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> >> const*)+0x28e) [0x7facb0c2aa7e]
> >>  6: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, 
> >> std::less, std::allocator > const&, 
> >> MapCacher::Transaction<std::__cxx11::basic_string<char, 
> >> std::char_traits, std::allocator >, 
> >> ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
> >>  7: (get_attrs(ObjectStore*, coll_t, ghobject_t, 
> >> ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, 
> >> SnapMapper&)+0xafb) [0x55eef35f901b]
> >>  8: (ObjectStoreTool::get_object(ObjectStore*, coll_t, 
> >> ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) 
> >> [0x55eef35f9ae8]
> >>  9: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, 
> >> std::__cxx11::basic_string<char, std::char_traits, 
> >> std::allocator >, ObjectStore::Sequencer&)+0x1135) [0x55eef36002f5]
> >>  10: (main()+0x3909) [0x55eef3561349]
> >>  11: (__libc_start_main()+0xf1) [0x7facae0892b1]
> >>  12: (_start()+0x2a) [0x55eef35e901a]
> >> Aborted

-- 
Mykola Golub


Re: [ceph-users] rbd-fuse performance

2017-06-28 Thread Mykola Golub
On Tue, Jun 27, 2017 at 07:17:22PM -0400, Daniel K wrote:

> rbd-nbd isn't good as it stops at 16 block devices (/dev/nbd0-15)

modprobe nbd nbds_max=1024

Or, if the nbd module is loaded by rbd-nbd, use the --nbds_max command
line option.
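
To make the higher limit persist across reboots, the usual modprobe configuration mechanism can be used (the file name is a convention, adjust for your distro):

```sh
# /etc/modprobe.d/nbd.conf
options nbd nbds_max=1024
```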

-- 
Mykola Golub


Re: [ceph-users] cannot open /dev/xvdb: Input/output error

2017-06-26 Thread Mykola Golub
On Mon, Jun 26, 2017 at 07:12:31PM +0200, Massimiliano Cuttini wrote:

> >In your case (rbd-nbd) this error is harmless. You can avoid them
> >setting in ceph.conf, [client] section something like below:
> >
> >  admin socket = /var/run/ceph/$name.$pid.asok
> >
> >Also to make every rbd-nbd process to log to a separate file you can
> >set (in [client] section):
> >
> >  log file = /var/log/ceph/$name.$pid.log
> I need to create all the users in the ceph cluster before using this.
> At the moment the whole cluster was running with the ceph admin keyring.
> However, this is not an issue, I can rapidly deploy all users
> needed.

I don't understand this. I think just adding these parameters to
ceph.conf should work.

> 
> >>root 12610  0.0  0.2 1836768 11412 ?   Sl   Jun23   0:43 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-602b05be-395d-442e-bd68-7742deaf97bd
> >> --name client.admin
> >>root 17298  0.0  0.2 1644244 8420 ?Sl   21:15   0:01 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-3e16395d-7dad-4680-a7ad-7f398da7fd9e
> >> --name client.admin
> >>root 18116  0.0  0.2 1570512 8428 ?Sl   21:15   0:01 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-41a76fe7-c9ff-4082-adb4-43f3120a9106
> >> --name client.admin
> >>root 19063  0.1  1.3 2368252 54944 ?   Sl   21:15   0:10 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-6da2154e-06fd-4063-8af5-ae86ae61df50
> >> --name client.admin
> >>root 21007  0.0  0.2 1570512 8644 ?Sl   21:15   0:01 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-c8aca7bd-1e37-4af4-b642-f267602e210f
> >> --name client.admin
> >>root 21226  0.0  0.2 1703640 8744 ?Sl   21:15   0:01 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-cf2139ac-b1c4-404d-87da-db8f992a3e72
> >> --name client.admin
> >>root 21615  0.5  1.4 2368252 60256 ?   Sl   21:15   0:33 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-acb2a9b0-e98d-474e-aa42-ed4e5534ddbe
> >> --name client.admin
> >>root 21653  0.0  0.2 1703640 11100 ?   Sl   04:12   0:14 rbd-nbd 
> >>--nbds_max 64 map 
> >>RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-8631ab86-c85c-407b-9e15-bd86e830ba74
> >> --name client.admin
> >Do you observe the issue for all these volumes? I see many of them
> >were started recently (21:15) while others are older.
> Only some of them.
> But it's randomly.
> Some of old and some just plugged becomes unavailable to xen.

Do you mean by "unavailable" that the image is corrupted, or that it
reports IO errors? If it is the first case, then it was corrupted
some time ago and we would need logs from that period to understand
what happened.

> >Don't you observe sporadic crashes/restarts of rbd-nbd processes? You
> >can associate a nbd device with rbd-nbd process (and rbd volume)
> >looking at /sys/block/nbd*/pid and ps output.
> I really don't know where to look for the rbd-nbd log.
> Can you point it out?

According to some of your previous messages rbd-nbd is writing to
/var/log/ceph/client.log:

> Under /var/log/ceph/client.log
> I see this error:
> 
> 2017-06-25 05:25:32.833202 7f658ff04e00  0 ceph version 10.2.7
> (50e863e0f4bc8f4b9e31156de690d765af245185), process rbd-nbd, pid 8524

You could look for errors in older log files if they are rotated.

-- 
Mykola Golub


Re: [ceph-users] cannot open /dev/xvdb: Input/output error

2017-06-26 Thread Mykola Golub
On Sun, Jun 25, 2017 at 11:28:37PM +0200, Massimiliano Cuttini wrote:
> 
> On 25/06/2017 21:52, Mykola Golub wrote:
> >On Sun, Jun 25, 2017 at 06:58:37PM +0200, Massimiliano Cuttini wrote:
> >>I can see the error even if I easily run list-mapped:
> >>
> >># rbd-nbd list-mapped
> >>/dev/nbd0
> >>2017-06-25 18:49:11.761962 7fcdd9796e00 -1 asok(0x7fcde3f72810) 
> >> AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed 
> >> to bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.asok': 
> >> (17) File exists/dev/nbd1
> >"AdminSocket::bind_and_listen: failed to bind" errors are harmless,
> >you can safely ignore them (or configure admin_socket in ceph.conf to
> >avoid names collisions).
> I read around that this can lead to a lock on opening.
> http://tracker.ceph.com/issues/7690
> If the daemon exists then you have to wait until it ends its operation
> before you can connect.

In your case (rbd-nbd) this error is harmless. You can avoid them
setting in ceph.conf, [client] section something like below:

 admin socket = /var/run/ceph/$name.$pid.asok

Also to make every rbd-nbd process to log to a separate file you can
set (in [client] section):

 log file = /var/log/ceph/$name.$pid.log
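
Put together, the suggested [client] section would look like this:

```ini
[client]
    # one admin socket per process, avoids name collisions
    admin socket = /var/run/ceph/$name.$pid.asok
    # one log file per rbd-nbd process
    log file = /var/log/ceph/$name.$pid.log
```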

> root 12610  0.0  0.2 1836768 11412 ?   Sl   Jun23   0:43 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-602b05be-395d-442e-bd68-7742deaf97bd
>  --name client.admin
> root 17298  0.0  0.2 1644244 8420 ?Sl   21:15   0:01 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-3e16395d-7dad-4680-a7ad-7f398da7fd9e
>  --name client.admin
> root 18116  0.0  0.2 1570512 8428 ?Sl   21:15   0:01 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-41a76fe7-c9ff-4082-adb4-43f3120a9106
>  --name client.admin
> root 19063  0.1  1.3 2368252 54944 ?   Sl   21:15   0:10 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-6da2154e-06fd-4063-8af5-ae86ae61df50
>  --name client.admin
> root 21007  0.0  0.2 1570512 8644 ?Sl   21:15   0:01 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-c8aca7bd-1e37-4af4-b642-f267602e210f
>  --name client.admin
> root 21226  0.0  0.2 1703640 8744 ?Sl   21:15   0:01 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-cf2139ac-b1c4-404d-87da-db8f992a3e72
>  --name client.admin
> root 21615  0.5  1.4 2368252 60256 ?   Sl   21:15   0:33 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-acb2a9b0-e98d-474e-aa42-ed4e5534ddbe
>  --name client.admin
> root 21653  0.0  0.2 1703640 11100 ?   Sl   04:12   0:14 rbd-nbd 
> --nbds_max 64 map 
> RBD_XenStorage-51a45fd8-a4d1-4202-899c-00a0f81054cc/VHD-8631ab86-c85c-407b-9e15-bd86e830ba74
>  --name client.admin

Do you observe the issue for all these volumes? I see many of them
were started recently (21:15) while others are older.

Don't you observe sporadic crashes/restarts of rbd-nbd processes? You
can associate an nbd device with its rbd-nbd process (and rbd volume)
by looking at /sys/block/nbd*/pid and ps output.
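
A minimal sketch of that association:

```shell
# Print "nbd device -> rbd-nbd pid" for every currently mapped device.
list_nbd_pids() {
  for f in /sys/block/nbd*/pid; do
    [ -e "$f" ] || continue   # glob stays literal if no nbd device exists
    printf '%s -> pid %s\n' "${f%/pid}" "$(cat "$f")"
  done
}

list_nbd_pids
# Match each pid against `ps auxww | grep rbd-nbd` to see the rbd volume.
```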

-- 
Mykola Golub


Re: [ceph-users] cannot open /dev/xvdb: Input/output error

2017-06-25 Thread Mykola Golub
On Sun, Jun 25, 2017 at 06:58:37PM +0200, Massimiliano Cuttini wrote:
> I can see the error even if I easily run list-mapped:
> 
># rbd-nbd list-mapped
>/dev/nbd0
>2017-06-25 18:49:11.761962 7fcdd9796e00 -1 asok(0x7fcde3f72810) 
> AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to 
> bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.asok': (17) 
> File exists/dev/nbd1

"AdminSocket::bind_and_listen: failed to bind" errors are harmless;
you can safely ignore them (or configure admin_socket in ceph.conf to
avoid name collisions).

Don't you see other errors?

What is output for `ps auxww |grep rbd-nbd`?

As a first step you could try to export the images to files using `rbd
export`, see if it succeeds, and possibly investigate the content.
-- 
Mykola Golub


Re: [ceph-users] Help needed rbd feature enable

2017-06-24 Thread Mykola Golub
e 'VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4':
> >>> size 102400 MB in 51200 objects
> >>> order 21 (2048 kB objects)
> >>> block_name_prefix: rbd_data.5fde2ae8944a
> >>> format: 2
> >>> features:
> >>> flags:
> >>>
> >>>try to enabling them will get this error:
> >>>
> >>>rbd: failed to update image features: (22) Invalid argument
> >>>2017-06-23 21:20:03.748746 7fdec1b34d80 -1 librbd: cannot 
> >>> update immutable features
> >>>
> >>>I read in the guide that I should have set
> >>>|rbd_default_features| in the config.
> >>>
> >>>What can I do now to enable all the features of
> >>>jewel on all images?
> >>>Can I enable all the jewel features, or is there any issue
> >>>with old kernels?
> >>>
> >>>|
> >>>|
> >>>
> >>>|Thanks,
> >>>Max
> >>>|
> >>>
> >>>
> >>
> >
> >
> >
> 

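For context: only some features (e.g. exclusive-lock, object-map, fast-diff, journaling) can be toggled on an existing image; others, such as layering and striping, are fixed at image creation time, which is why `rbd feature enable` reports "cannot update immutable features" for them. A sketch of enabling the mutable ones (pool and image names are placeholders; object-map requires exclusive-lock first):

```sh
rbd feature enable rbd/myimage exclusive-lock
rbd feature enable rbd/myimage object-map fast-diff
```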


-- 
Mykola Golub


Re: [ceph-users] removing 'rados cppool' command

2016-05-07 Thread Mykola Golub
On Fri, May 06, 2016 at 03:41:34PM -0400, Sage Weil wrote:
> This PR
> 
>   https://github.com/ceph/ceph/pull/8975
> 
> removes the 'rados cppool' command.  The main problem is that the command 
> does not make a faithful copy of all data because it doesn't preserve the 
> snapshots (and snapshot related metadata).  That means if you copy an RBD 
> pool it will render the images somewhat broken (snaps won't be present and 
> won't work properly).  It also doesn't preserve the user_version field 
> that some librados users may rely on.
> 
> Since it's obscure and of limited use, this PR just removes it.

Copying a pool is sometimes useful, even with those limitations.

Until there is an alternative way to do the same thing, I would not
remove it. A better approach, in my view, would be to move this
functionality into something like `ceph_radostool` (people use such
tools only when facing extraordinary situations, so they are more
careful and expect limitations).
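For reference, the command being removed is invoked as follows (pool names are placeholders); the caveats above about snapshots and user_version apply:

```sh
# Copies all objects from srcpool to dstpool, but does NOT preserve
# snapshots, snapshot metadata, or user_version fields.
rados cppool srcpool dstpool
```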

-- 
Mykola Golub


Re: [ceph-users] RBD image mounted by command "rbd-nbd" the status is read-only.

2016-04-28 Thread Mykola Golub
On Mon, Apr 25, 2016 at 08:09:54PM +0200, Ilya Dryomov wrote:
> On Mon, Apr 25, 2016 at 7:47 PM, Stefan Lissmats <ste...@trimmat.se> wrote:
> > Hello again!
> >
> > I understand that it's not recommended to run osd and rbd-nbd on the same
> > host, and I actually moved my rbd-nbd to a completely clean host (same
> > kernel and OS though), but with the same result.
> >
> > I hope someone can resolve this. You seem to indicate it is some kind of
> > known error, but I didn't really understand the github commit that you
> > linked.
> 
> Yes, it is a bug.  rbd-nbd code expects writes to have rval (return
> code) equal to the size of the write.  I'm pretty sure that's wrong,
> because rval for writes should be 0 or a negative error.
> 
> I think what happens is your writes complete successfully, but rbd-nbd
> then throws an -EIO to the kernel because 0 != write size.  I could be
> wrong, so let's wait for Mykola to chime in - he added that check to
> fix discards.

Sorry for the delay (I missed this thread due to a wrong filter).

I don't recall the details, but I was under the impression that on
success the aio_write completion returned the number of bytes written.
I might have been confused by this test, which checks for r >= 0:

https://github.com/ceph/ceph/blob/master/src/test/librbd/test_librbd.cc#L1254

Now, looking at it again, that is certainly not true and my patch is
wrong.

I see a fix has already been requested:
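To make the distinction concrete, here is a hypothetical sketch of the completion check (an illustration of the semantics discussed above, not the actual rbd-nbd source):

```python
import errno

# For librbd AIO *writes*, a successful completion returns 0, not the
# number of bytes written; only reads return byte counts. Comparing the
# return value against the write length therefore rejects successful
# writes and surfaces a spurious -EIO to the kernel.

def write_completed_ok(rval: int) -> bool:
    """Return True if an AIO write completion indicates success."""
    # Buggy variant (what the discussion is about): `rval == length`.
    # Correct: any non-negative return value means success for a write.
    return rval >= 0

assert write_completed_ok(0)             # successful write (rval == 0)
assert not write_completed_ok(-errno.EIO)  # failed write
```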

https://github.com/ceph/ceph/pull/8775/

Thanks.

-- 
Mykola Golub


Re: [ceph-users] incomplete pg, recovery some data

2015-06-19 Thread Mykola Golub
On Thu, Jun 18, 2015 at 01:24:38PM +0200, Mateusz Skała wrote:
 Hi,
 
 After some hardware errors, one of the PGs on our backup server is 'incomplete'.
 
 I exported the PG without problems, as described here:
 https://ceph.com/community/incomplete-pgs-oh-my/
 
 After removing the PG from all OSDs and importing it into one OSD, the PG is
 still 'incomplete'.
 
 I want to recover only some pieces of data from this RBD, so if I lose
 something, nothing bad happens. How can I tell ceph to accept this PG as
 complete and clean?

I have a patch for ceph-objectstore-tool that adds a mark-complete operation,
as suggested by Sam in http://tracker.ceph.com/issues/10098

https://github.com/ceph/ceph/pull/5031

It has not been reviewed yet, and is not well tested, because I don't
know a simple way to produce an incomplete PG.

You might want to try it at your own risk.
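If the patch is applied, the operation would presumably be invoked along these lines (the data path, journal path, and pgid below are placeholders; the OSD must be stopped first):

```sh
# Proposed mark-complete operation from the patch above (a sketch).
# Run only against a stopped OSD, and only as a last resort.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal \
    --pgid 2.1f --op mark-complete
```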

-- 
Mykola Golub


Re: [ceph-users] ceph-osd - No Longer Creates osd.X upon Launch - Bug ?

2015-02-16 Thread Mykola Golub
On Sun, Feb 15, 2015 at 5:39 PM, Sage Weil s...@newdream.net wrote:
 On Sun, 15 Feb 2015, Mykola Golub wrote:
 The ceph osd create could be extended to have OSD ID as a second
 optional argument (the first is already used for uuid).

    ceph osd create <uuid> <id>

 The command would succeed only if the ID were not in use.

 Ron, would this work for you?

 I have a patch as a proof of concept:

 https://github.com/trociny/ceph/compare/wip-osd_create

 This looks reasonable to me!

 Do you mind adding a few test cases in qa/workunits/cephtool/test.sh to go
 along with it?

https://github.com/ceph/ceph/pull/3743/commits

-- 
Mykola Golub


Re: [ceph-users] ceph-osd - No Longer Creates osd.X upon Launch - Bug ?

2015-02-15 Thread Mykola Golub
On Thu, Feb 05, 2015 at 08:33:39AM -0700, Ron Allred wrote:
 Hello,
 
 The latest ceph-osd in Firefly v0.80.8 no longer auto-creates its osd.X
 entry in the osd map, which it was assigned via ceph.conf.
 
 I am very aware the documentation states that ceph osd create can do this
 job, but this command only assigns the next sequential osd.X number.  This is
 highly undesirable.  For _years_, we have assigned number ranges to each
 OSD server for an organized multi-tier (SSD / SAS / SATA) crush map.
 (leaving gaps in osd numbering, naturally.)  Skipping 'ceph osd create'
 entirely.
 
 We are now facing the problem that after an OSD remove+replace, we can't
 use its former osd.X ID, making a huge mess of documentation, number
 patterning, and disk labeling.
 
 Is there a work-around to forcefully create an osd.X number??

The ceph osd create could be extended to have OSD ID as a second
optional argument (the first is already used for uuid).
 
  ceph osd create <uuid> <id>

The command would succeed only if the ID were not in use.
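Under the proposal, usage would look something like this (the ID is a placeholder):

```sh
# Create an OSD with an explicit ID instead of the next sequential one.
uuid=$(uuidgen)
ceph osd create "$uuid" 120   # succeeds only if osd.120 is not in use
```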

Ron, would this work for you?

I have a patch as a proof of concept:

https://github.com/trociny/ceph/compare/wip-osd_create

-- 
Mykola Golub


Re: [ceph-users] ceph-osd - No Longer Creates osd.X upon Launch - Bug ?

2015-02-15 Thread Mykola Golub
On Sun, Feb 15, 2015 at 06:24:45PM -0800, Sage Weil wrote:
 On Sun, 15 Feb 2015, Gregory Farnum wrote:
  On Sun, Feb 15, 2015 at 5:39 PM, Sage Weil s...@newdream.net wrote:
   On Sun, 15 Feb 2015, Mykola Golub wrote:

   https://github.com/trociny/ceph/compare/wip-osd_create
  
   This looks reasonable to me!
  
   Do you mind adding a few test cases in qa/workunits/cephtool/test.sh to go
   along with it?

Will do. Thanks.

  
   Usual disclaimer: we discourage getting creative with the osd ids because
   they are allocated as an *array* in memory, so skipping entries consumes
   some extra memory.. this can become significant if there are large
   gaps and/or clusters are large.
  
  These options used to exist and were removed quite deliberately. I
  don't remember the entire conversation at this point but we'll need to
  find and address the concerns raised then before reintroducing the
  ability to explicitly set OSD IDs. IIRC I was on the losing end of
  this, because it's definitely behavior we should be offering to
  admins, but the issues were significant enough we had to eliminate the
  option. Methods of preserving the user-facing utility like adding OSD
  names were deemed too difficult to implement. :(
  
  (I think it largely had to do with serious issues over the
  availability and location of data when OSDs disappear, but new ones
  with the same ID are present. And what you do when somebody then
  resurrects the original OSDs. But there might have been other things
  too.)
 
 The part I remember was just that 'ceph osd create id' wasn't a safe and 
 idempotent command.  I don't think reusing ids is the problem, though if 
 it is then it is still a problem since osd create will re-use the first 
 available id.  I think the only thing this option lets us do that we 
 couldn't before is leave gaps in the id space?

I think so too -- leaving gaps should be the only difference compared
with what we already have. I was wondering, though, whether I might need
to do something with the nonexistent IDs in the gap, e.g. pre-allocate
them explicitly with some flag combination. It looks like I don't...

-- 
Mykola Golub


Re: [ceph-users] RBD caching on 4K reads???

2015-02-01 Thread Mykola Golub
On Fri, Jan 30, 2015 at 10:09:32PM +0100, Udo Lembke wrote:
 Hi Bruce,
 you can also look on the mon, like
 ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep cache

rbd cache is a client setting, so you have to check it by connecting to
the client's admin socket. Its location is defined in ceph.conf, in the
[client] section, via the admin socket parameter.
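For example, assuming the client's admin socket is at the default location (the path below is an assumption; check your ceph.conf):

```sh
# Same idea as the mon command above, but against the client's socket:
ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok config show | grep rbd_cache
```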

-- 
Mykola Golub


Re: [ceph-users] osd tree to show primary-affinity value

2015-01-07 Thread Mykola Golub
On Thu, Dec 25, 2014 at 03:57:15PM +1100, Dmitry Smirnov wrote:

 Please don't withhold this improvement -- go ahead and submit pull request to 
 let developers decide whether they want this or not. IMHO it is a very useful 
 improvement. Thank you very much for implementing it.

Done. https://github.com/ceph/ceph/pull/3254

-- 
Mykola Golub


[ceph-users] osd tree to show primary-affinity value

2014-12-24 Thread Mykola Golub
Hi,

I stumbled upon this feature request from Dmitry, to make osd tree
show primary-affinity value:

  http://tracker.ceph.com/issues/10036

This looks useful in some cases and is simple to implement, so here is
the patch:

  https://github.com/trociny/ceph/compare/feature-10036

But before sending a pull request, I'd like to hear other people's
opinions. I wonder whether it would be useful for most users, or whether
they would rather consider it white noise in the osd tree output.

Note that the primary-affinity can currently be obtained in other ways, e.g.:

 ceph -f json-pretty osd dump

Here is an example of osd tree output after the change:

 % ceph osd tree
 # id   weight  type name   up/down reweight    primary-affinity
 -1 3   root default
 -2 3   host zhuzha
 0  1   osd.0   up  1   0.5
 1  1   osd.1   up  1   0.75
 2  1   osd.2   up  1   1
 
 % ceph -f json-pretty osd tree
 
 { "nodes": [
       { "id": -1,
         "name": "default",
         "type": "root",
         "type_id": 10,
         "children": [
             -2]},
       { "id": -2,
         "name": "zhuzha",
         "type": "host",
         "type_id": 1,
         "children": [
             2,
             1,
             0]},
       { "id": 0,
         "name": "osd.0",
         "exists": 1,
         "type": "osd",
         "type_id": 0,
         "status": "up",
         "reweight": 1.00,
         "primary_affinity": 0.50,
         "crush_weight": 1.00,
         "depth": 2},
       { "id": 1,
         "name": "osd.1",
         "exists": 1,
         "type": "osd",
         "type_id": 0,
         "status": "up",
         "reweight": 1.00,
         "primary_affinity": 0.75,
         "crush_weight": 1.00,
         "depth": 2},
       { "id": 2,
         "name": "osd.2",
         "exists": 1,
         "type": "osd",
         "type_id": 0,
         "status": "up",
         "reweight": 1.00,
         "primary_affinity": 1.00,
         "crush_weight": 1.00,
         "depth": 2}],
   "stray": []}

-- 
Mykola Golub