Re: [ceph-users] Upgrading and lost OSDs

2019-11-22 Thread Brent Kennedy
I just ran into this today with a server we rebooted.  The server has been 
running Nautilus 14.2.2 for a few months; it was originally installed as 
Jewel, then upgraded to Luminous and then to Nautilus.  I have a whole server 
where all 12 OSD directories are now empty.  I recreated the keyring file and 
the type file, but now I get the “bluestore(/var/lib/ceph/osd/ceph-80/block) 
_read_bdev_label failed to open /var/lib/ceph/osd/ceph-80/block: (2) No such 
file or directory” error.

 

Nov 22 23:10:58 ukosdhost15 systemd: Starting Ceph object storage daemon osd.80...

Nov 22 23:10:58 ukosdhost15 systemd: Started Ceph object storage daemon osd.80.

Nov 22 23:10:58 ukosdhost15 ceph-osd: 2019-11-22 23:10:58.662 7f86ebe92d80 -1 bluestore(/var/lib/ceph/osd/ceph-80/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-80/block: (2) No such file or directory

Nov 22 23:10:58 ukosdhost15 ceph-osd: 2019-11-22 23:10:58.662 7f86ebe92d80 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-80: (2) No such file or directory

Nov 22 23:10:58 ukosdhost15 systemd: ceph-osd@80.service: main process exited, code=exited, status=1/FAILURE

 

Were you able to restore those OSDs?  I was adding 24 more OSDs when a network 
issue occurred, and this server was rebooted as part of dealing with that (which 
is when its OSDs died).
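
A rough recovery sketch for this state -- assuming these are ceph-disk-style 
OSDs that were adopted with 'ceph-volume simple', and with /dev/sdX1 standing 
in for the real data partition -- would be to confirm that the bluestore label 
on the raw device is still readable and then let ceph-volume rebuild the 
directory contents:

# check the on-disk bluestore label (device path is a placeholder)
ceph-bluestore-tool show-label --dev /dev/sdX1

# regenerate /etc/ceph/osd/*.json from the device and re-activate
ceph-volume simple scan /dev/sdX1
ceph-volume simple activate --all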

 

-Brent

 

From: ceph-users  On Behalf Of Alfredo Deza
Sent: Friday, July 26, 2019 12:48 PM
To: Bob R 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Upgrading and lost OSDs

On Thu, Jul 25, 2019 at 7:00 PM Bob R <b...@drinksbeer.org> wrote:

I would try 'mv /etc/ceph/osd{,.old}' and then run 'ceph-volume simple scan' 
again. We had some problems upgrading due to OSDs (perhaps initially installed 
as Firefly?) missing the 'type' attribute, and IIRC the 'ceph-volume simple 
scan' command refused to overwrite existing JSON files after I made some 
changes to ceph-volume.
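
A rough sketch of that sequence (a specific data partition such as /dev/sdX1 
-- a placeholder -- can be passed to the scan; with no argument it picks up 
the running OSDs):

# set aside the previously generated metadata
mv /etc/ceph/osd /etc/ceph/osd.old

# regenerate the JSON files and activate the OSDs again
ceph-volume simple scan
ceph-volume simple activate --all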

 

Ooof. I could swear that this issue was fixed already and it took me a while to 
find out that it wasn't at all. We saw this a few months ago in our Long 
Running Cluster used for dogfooding. 

 

I've created a ticket to track this work at http://tracker.ceph.com/issues/40987

 

But what you've done is exactly why we chose to persist the JSON files in 
/etc/ceph/osd/*.json, so that an admin could tell if anything is missing (or 
incorrect like in this case) and make the changes needed.
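
As a quick check for such gaps, a sketch that flags any persisted file missing 
the 'type' key (assumes jq is available):

for f in /etc/ceph/osd/*.json; do
    jq -e 'has("type")' "$f" > /dev/null || echo "missing 'type' in $f"
done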

Bob

 

On Wed, Jul 24, 2019 at 1:24 PM Alfredo Deza <ad...@redhat.com> wrote:

On Wed, Jul 24, 2019 at 4:15 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:

Hi,

 

I appreciate the insistence that the directions be followed.  I wholly agree.  
The only liberty I took was to do a ‘yum update’ instead of just ‘yum update 
ceph-osd’ and then reboot.  (Also, my MDS runs on the MON hosts, so it got 
updated a step early.)

 

As for the logs:

 

[2019-07-24 15:07:22,713][ceph_volume.main][INFO  ] Running command: ceph-volume  simple scan
[2019-07-24 15:07:22,714][ceph_volume.process][INFO  ] Running command: /bin/systemctl show --no-pager --property=Id --state=running ceph-osd@*
[2019-07-24 15:07:27,574][ceph_volume.main][INFO  ] Running command: ceph-volume  simple activate --all
[2019-07-24 15:07:27,575][ceph_volume.devices.simple.activate][INFO  ] activating OSD specified in /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] Required devices (block and data) not present for bluestore
[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] bluestore devices found: [u'data']
[2019-07-24 15:07:27,576][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/main.py", line 33, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 272, in main
    self.activate(args)
  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 131, in activate
    self.validate_devices(osd_metadata)
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 62, in validate_devices
    raise RuntimeError('Unable to activate bluestore OSD due to

Re: [ceph-users] Dynamic bucket index resharding bug? - rgw.none showing unreal number of objects

2019-11-22 Thread Paul Emmerich
On Fri, Nov 22, 2019 at 9:09 PM J. Eric Ivancich  wrote:

> 2^64 (2 to the 64th power) is 18446744073709551616, which is 13 greater
> than your value of 18446744073709551603. So this likely represents the
> value of -13, but displayed in an unsigned format.

I've seen this with values between -2 and -10, see
https://tracker.ceph.com/issues/37942


Paul

>
> Obviously it should not calculate a value of -13. I'm guessing it's a
> bug: when bucket index entries that are categorized as rgw.none are
> found, we're not adding to the stats, but when they're removed they are
> being subtracted from the stats.
>
> Interestingly resharding recalculates these, so you'll likely have a
> much smaller value when you're done.
>
> It seems the operations that result in rgw.none bucket index entries are
> cancelled operations and removals.
>
> We're currently looking at how best to deal with rgw.none stats here:
>
> https://github.com/ceph/ceph/pull/29062
>
> Eric
>
> --
> J. Eric Ivancich
> he/him/his
> Red Hat Storage
> Ann Arbor, Michigan, USA
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dynamic bucket index resharding bug? - rgw.none showing unreal number of objects

2019-11-22 Thread J. Eric Ivancich
On 11/22/19 11:50 AM, David Monschein wrote:
> Hi all. Running an Object Storage cluster with Ceph Nautilus 14.2.4.
> 
> We are running into what appears to be a serious bug that is affecting
> our fairly new object storage cluster. While investigating some
> performance issues -- seeing abnormally high IOPS, extremely slow bucket
> stat listings (over 3 minutes) -- we noticed some dynamic bucket
> resharding jobs running. Strangely enough they were resharding buckets
> that had very few objects. Even more worrying was the number of new
> shards Ceph was planning: 65521
> 
> [root@os1 ~]# radosgw-admin reshard list
> [
>     {
>         "time": "2019-11-22 00:12:40.192886Z",
>         "tenant": "",
>         "bucket_name": "redacteed",
>         "bucket_id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>         "new_instance_id":
> "redacted:c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7552496.28",
>         "old_num_shards": 1,
>         "new_num_shards": 65521
>     }
> ]
> 
> Upon further inspection we noticed a seemingly impossible number of
> objects (18446744073709551603) in rgw.none for the same bucket:
> [root@os1 ~]# radosgw-admin bucket stats --bucket=redacted
> {
>     "bucket": "redacted",
>     "tenant": "",
>     "zonegroup": "dbb69c5b-b33f-4af2-950c-173d695a4d2c",
>     "placement_rule": "default-placement",
>     "explicit_placement": {
>         "data_pool": "",
>         "data_extra_pool": "",
>         "index_pool": ""
>     },
>     "id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>     "marker": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
>     "index_type": "Normal",
>     "owner": "d52cb8cc-1f92-47f5-86bf-fb28bc6b592c",
>     "ver": "0#12623",
>     "master_ver": "0#0",
>     "mtime": "2019-11-22 00:18:41.753188Z",
>     "max_marker": "0#",
>     "usage": {
>         "rgw.none": {
>             "size": 0,
>             "size_actual": 0,
>             "size_utilized": 0,
>             "size_kb": 0,
>             "size_kb_actual": 0,
>             "size_kb_utilized": 0,
>             "num_objects": 18446744073709551603
>         },
>         "rgw.main": {
>             "size": 63410030,
>             "size_actual": 63516672,
>             "size_utilized": 63410030,
>             "size_kb": 61924,
>             "size_kb_actual": 62028,
>             "size_kb_utilized": 61924,
>             "num_objects": 27
>         },
>         "rgw.multimeta": {
>             "size": 0,
>             "size_actual": 0,
>             "size_utilized": 0,
>             "size_kb": 0,
>             "size_kb_actual": 0,
>             "size_kb_utilized": 0,
>             "num_objects": 0
>         }
>     },
>     "bucket_quota": {
>         "enabled": false,
>         "check_on_raw": false,
>         "max_size": -1,
>         "max_size_kb": 0,
>         "max_objects": -1
>     }
> }
> 
> It would seem that the unreal number of objects in rgw.none is driving
> the resharding process, making ceph reshard the bucket 65521 times. I am
> assuming 65521 is the limit.
> 
> I have seen only a couple of references to this issue, none of which had
> a resolution or much of a conversation around them:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030791.html
> https://tracker.ceph.com/issues/37942
> 
> For now we are cancelling these resharding jobs since they seem to be
> causing performance issues with the cluster, but this is an untenable
> solution. Does anyone know what is causing this? Or how to prevent
> it/fix it?


2^64 (2 to the 64th power) is 18446744073709551616, which is 13 greater
than your value of 18446744073709551603. So this likely represents the
value of -13, but displayed in an unsigned format.
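
A quick way to see that wrap-around, just to illustrate the arithmetic:

$ python3 -c 'print((-13) % 2**64)'
18446744073709551603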

Obviously it should not calculate a value of -13. I'm guessing it's a
bug: when bucket index entries that are categorized as rgw.none are
found, we're not adding to the stats, but when they're removed they are
being subtracted from the stats.

Interestingly resharding recalculates these, so you'll likely have a
much smaller value when you're done.
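
If you'd rather not wait for the dynamic resharder, a manual reshard should
trigger that same recalculation -- a sketch, with the bucket name and shard
count as placeholders:

radosgw-admin bucket reshard --bucket=redacted --num-shards=11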

It seems the operations that result in rgw.none bucket index entries are
cancelled operations and removals.

We're currently looking at how best to deal with rgw.none stats here:

https://github.com/ceph/ceph/pull/29062

Eric

-- 
J. Eric Ivancich
he/him/his
Red Hat Storage
Ann Arbor, Michigan, USA

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD Mirror DR Testing

2019-11-22 Thread Jason Dillaman
On Fri, Nov 22, 2019 at 11:16 AM Vikas Rana  wrote:
>
> Hi All,
>
> We have an XFS filesystem on the Prod side, and when we try to mount the DR 
> copy, we get a superblock error
>
> root@:~# rbd-nbd map nfs/dir
> /dev/nbd0
> root@:~# mount /dev/nbd0 /mnt
> mount: /dev/nbd0: can't read superblock

It doesn't look like you are mapping a snapshot.
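
A sketch of what mapping and mounting a replicated snapshot read-only could
look like (the snapshot name is a placeholder; the norecovery/nouuid options
keep XFS from trying to replay its log on the read-only device):

$ rbd-nbd map --read-only nfs/dir@dr-check
/dev/nbd0
$ mount -t xfs -o ro,norecovery,nouuid /dev/nbd0 /mnt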

>
> Any suggestions to test the DR copy any other way or if I'm doing something 
> wrong?
>
> Thanks,
> -Vikas
>
> -Original Message-
> From: Jason Dillaman 
> Sent: Thursday, November 21, 2019 10:24 AM
> To: Vikas Rana 
> Cc: dillaman ; ceph-users 
> Subject: Re: [ceph-users] RBD Mirror DR Testing
>
> On Thu, Nov 21, 2019 at 10:16 AM Vikas Rana  wrote:
> >
> > Thanks Jason.
> > We are just mounting and verifying the directory structure to make sure it 
> > looks good.
> >
> > My understanding was, in 12.2.10, we can't mount the DR snapshot as the RBD 
> > image is non-primary. Is this wrong?
>
> You have always been able to access non-primary images for read-only 
> operations (only writes are prevented):
>
> $ rbd info test
> rbd image 'test':
> <... snip ...>
> mirroring primary: false
>
> $ rbd device --device-type nbd map test@1
> /dev/nbd0
> $ mount /dev/nbd0 /mnt/
> mount: /mnt: WARNING: device write-protected, mounted read-only.
> $ ll /mnt/
> total 0
> -rw-r--r--. 1 root root 0 Nov 21 10:20 hello.world
>
> > Thanks,
> > -Vikas
> >
> > -Original Message-
> > From: Jason Dillaman 
> > Sent: Thursday, November 21, 2019 9:58 AM
> > To: Vikas Rana 
> > Cc: ceph-users 
> > Subject: Re: [ceph-users] RBD Mirror DR Testing
> >
> > On Thu, Nov 21, 2019 at 9:56 AM Jason Dillaman  wrote:
> > >
> > > On Thu, Nov 21, 2019 at 8:49 AM Vikas Rana  wrote:
> > > >
> > > > Thanks Jason for such a quick response. We are on 12.2.10.
> > > >
> > > > Checksumming a 200TB image will take a long time.
> > >
> > > How would mounting an RBD image and scanning the image be faster?
> > > Are you only using a small percentage of the image?
> >
> > ... and of course, you can mount an RBD snapshot in read-only mode.
> >
> > > > To test the DR copy by mounting it, these are the steps I'm
> > > > planning to follow:
> > > > 1. Demote the Prod copy and promote the DR copy
> > > > 2. Do we have to recreate the rbd mirror relationship going from DR to 
> > > > primary?
> > > > 3. Mount and validate the data
> > > > 4. Demote the DR copy and promote the Prod copy
> > > > 5. Revert the peer relationship if required?
> > > >
> > > > Did I do it right or miss anything?
> > >
> > > You cannot change the peers or you will lose the relationship. If
> > > you insist on your course of action, you just need to be configured
> > > for two-way mirroring and leave it that way.
> > >
> > > >
> > > > Thanks,
> > > > -Vikas
> > > >
> > > > -Original Message-
> > > > From: Jason Dillaman 
> > > > Sent: Thursday, November 21, 2019 8:33 AM
> > > > To: Vikas Rana 
> > > > Cc: ceph-users 
> > > > Subject: Re: [ceph-users] RBD Mirror DR Testing
> > > >
> > > > On Thu, Nov 21, 2019 at 8:29 AM Vikas Rana  wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > >
> > > > >
> > > > > We have a 200TB RBD image which we are replicating using RBD 
> > > > > mirroring.
> > > > >
> > > > > We want to test the DR copy and make sure that we have a consistent 
> > > > > copy in case primary site is lost.
> > > > >
> > > > >
> > > > >
> > > > > We did it previously and promoted the DR copy, which broke the DR 
> > > > > copy away from the primary, and we had to resync the whole 200TB of data.
> > > > >
> > > > >
> > > > >
> > > > > Is there any correct way of doing it so we don’t have to resync all 
> > > > > 200TB again?
> > > >
> > > > Yes, create a snapshot on the primary site and let it propagate to the 
> > > > non-primary site. Then you can compare checksums at the snapshot w/o 
> > > > having to worry about the data changing. Once you have finished, delete 
> > > > the snapshot on the primary site and it will propagate over to the 
> > > > non-primary site.
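
A sketch of that comparison, assuming the image is nfs/dir and using a
throwaway snapshot name (both placeholders); rbd export streams the snapshot
contents so a checksum can be computed independently on each site:

# on the primary site
rbd snap create nfs/dir@dr-check
rbd export nfs/dir@dr-check - | md5sum

# on the DR site, once the snapshot has replicated
rbd export nfs/dir@dr-check - | md5sum

# when finished, remove it on the primary and let the removal propagate
rbd snap rm nfs/dir@dr-check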
> > > >
> > > > >
> > > > >
> > > > > Can we demote current primary and then promote the DR copy and test 
> > > > > and then revert back? Will that require the complete 200TB sync?
> > > > >
> > > >
> > > > It's only the forced-promotion that causes split-brain. If you 
> > > > gracefully demote from site A and promote site B, and then demote site 
> > > > B and promote site A, that will not require a sync. However, again, 
> > > > it's probably just easier to use a snapshot.
> > > >
> > > > >
> > > > > Thanks in advance for your help and suggestions.
> > > > >
> > > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > -Vikas
> > > > >
> > > > > ___
> > > > > ceph-users mailing list
> > > > > ceph-users@lists.ceph.com
> > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > >
> > > >
> > > >
> > > > --
> > > > Jason
> > > >
> > > >
> > >
> > >
> > > --
> > > Jason
> >
> >
> >
> > --
> > Jason
> >
> >
>
>
> --

Re: [ceph-users] Dynamic bucket index resharding bug? - rgw.none showing unreal number of objects

2019-11-22 Thread Paul Emmerich
I originally reported the linked issue. I've seen this problem with
negative stats on several S3 setups, but I could never figure out
how to reproduce it.

But I haven't seen the resharder act on these stats; that seems like a
particularly bad case :(


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Fri, Nov 22, 2019 at 5:51 PM David Monschein  wrote:
>
> Hi all. Running an Object Storage cluster with Ceph Nautilus 14.2.4.
>
> We are running into what appears to be a serious bug that is affecting our 
> fairly new object storage cluster. While investigating some performance 
> issues -- seeing abnormally high IOPS, extremely slow bucket stat listings 
> (over 3 minutes) -- we noticed some dynamic bucket resharding jobs running. 
> Strangely enough they were resharding buckets that had very few objects. Even 
> more worrying was the number of new shards Ceph was planning: 65521
>
> [root@os1 ~]# radosgw-admin reshard list
> [
> {
> "time": "2019-11-22 00:12:40.192886Z",
> "tenant": "",
> "bucket_name": "redacteed",
> "bucket_id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
> "new_instance_id": 
> "redacted:c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7552496.28",
> "old_num_shards": 1,
> "new_num_shards": 65521
> }
> ]
>
> Upon further inspection we noticed a seemingly impossible number of objects 
> (18446744073709551603) in rgw.none for the same bucket:
> [root@os1 ~]# radosgw-admin bucket stats --bucket=redacted
> {
> "bucket": "redacted",
> "tenant": "",
> "zonegroup": "dbb69c5b-b33f-4af2-950c-173d695a4d2c",
> "placement_rule": "default-placement",
> "explicit_placement": {
> "data_pool": "",
> "data_extra_pool": "",
> "index_pool": ""
> },
> "id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
> "marker": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
> "index_type": "Normal",
> "owner": "d52cb8cc-1f92-47f5-86bf-fb28bc6b592c",
> "ver": "0#12623",
> "master_ver": "0#0",
> "mtime": "2019-11-22 00:18:41.753188Z",
> "max_marker": "0#",
> "usage": {
> "rgw.none": {
> "size": 0,
> "size_actual": 0,
> "size_utilized": 0,
> "size_kb": 0,
> "size_kb_actual": 0,
> "size_kb_utilized": 0,
> "num_objects": 18446744073709551603
> },
> "rgw.main": {
> "size": 63410030,
> "size_actual": 63516672,
> "size_utilized": 63410030,
> "size_kb": 61924,
> "size_kb_actual": 62028,
> "size_kb_utilized": 61924,
> "num_objects": 27
> },
> "rgw.multimeta": {
> "size": 0,
> "size_actual": 0,
> "size_utilized": 0,
> "size_kb": 0,
> "size_kb_actual": 0,
> "size_kb_utilized": 0,
> "num_objects": 0
> }
> },
> "bucket_quota": {
> "enabled": false,
> "check_on_raw": false,
> "max_size": -1,
> "max_size_kb": 0,
> "max_objects": -1
> }
> }
>
> It would seem that the unreal number of objects in rgw.none is driving the 
> resharding process, making ceph reshard the bucket 65521 times. I am assuming 
> 65521 is the limit.
>
> I have seen only a couple of references to this issue, none of which had a 
> resolution or much of a conversation around them:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030791.html
> https://tracker.ceph.com/issues/37942
>
> For now we are cancelling these resharding jobs since they seem to be causing 
> performance issues with the cluster, but this is an untenable solution. Does 
> anyone know what is causing this? Or how to prevent it/fix it?
>
> Thanks,
> Dave Monschein
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Dynamic bucket index resharding bug? - rgw.none showing unreal number of objects

2019-11-22 Thread David Monschein
Hi all. Running an Object Storage cluster with Ceph Nautilus 14.2.4.

We are running into what appears to be a serious bug that is affecting our
fairly new object storage cluster. While investigating some performance
issues -- seeing abnormally high IOPS, extremely slow bucket stat listings
(over 3 minutes) -- we noticed some dynamic bucket resharding jobs running.
Strangely enough they were resharding buckets that had very few objects.
Even more worrying was the number of new shards Ceph was planning: 65521

[root@os1 ~]# radosgw-admin reshard list
[
{
"time": "2019-11-22 00:12:40.192886Z",
"tenant": "",
"bucket_name": "redacteed",
"bucket_id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
"new_instance_id":
"redacted:c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7552496.28",
"old_num_shards": 1,
"new_num_shards": 65521
}
]

Upon further inspection we noticed a seemingly impossible number of objects
(18446744073709551603) in rgw.none for the same bucket:
[root@os1 ~]# radosgw-admin bucket stats --bucket=redacted
{
"bucket": "redacted",
"tenant": "",
"zonegroup": "dbb69c5b-b33f-4af2-950c-173d695a4d2c",
"placement_rule": "default-placement",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
},
"id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
"marker": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.7000639.20",
"index_type": "Normal",
"owner": "d52cb8cc-1f92-47f5-86bf-fb28bc6b592c",
"ver": "0#12623",
"master_ver": "0#0",
"mtime": "2019-11-22 00:18:41.753188Z",
"max_marker": "0#",
"usage": {
"rgw.none": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 18446744073709551603
},
"rgw.main": {
"size": 63410030,
"size_actual": 63516672,
"size_utilized": 63410030,
"size_kb": 61924,
"size_kb_actual": 62028,
"size_kb_utilized": 61924,
"num_objects": 27
},
"rgw.multimeta": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 0
}
},
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
}

It would seem that the unreal number of objects in rgw.none is driving the
resharding process, making Ceph reshard the bucket into 65521 shards. I am
assuming 65521 is the limit.

I have seen only a couple of references to this issue, none of which had a
resolution or much of a conversation around them:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030791.html
https://tracker.ceph.com/issues/37942

For now we are cancelling these resharding jobs since they seem to be
causing performance issues with the cluster, but this is an untenable
solution. Does anyone know what is causing this? Or how to prevent it/fix
it?
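
Cancelling a pending job per bucket looks roughly like this (bucket name is a
placeholder):

radosgw-admin reshard list
radosgw-admin reshard cancel --bucket=redacted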

Thanks,
Dave Monschein
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD Mirror DR Testing

2019-11-22 Thread Vikas Rana
Hi All,

We have an XFS filesystem on the Prod side, and when we try to mount the DR copy, 
we get a superblock error.

root@:~# rbd-nbd map nfs/dir 
/dev/nbd0
root@:~# mount /dev/nbd0 /mnt
mount: /dev/nbd0: can't read superblock


Any suggestions to test the DR copy any other way or if I'm doing something 
wrong?

Thanks,
-Vikas 

-Original Message-
From: Jason Dillaman  
Sent: Thursday, November 21, 2019 10:24 AM
To: Vikas Rana 
Cc: dillaman ; ceph-users 
Subject: Re: [ceph-users] RBD Mirror DR Testing

On Thu, Nov 21, 2019 at 10:16 AM Vikas Rana  wrote:
>
> Thanks Jason.
> We are just mounting and verifying the directory structure to make sure it 
> looks good.
>
> My understanding was, in 12.2.10, we can't mount the DR snapshot as the RBD 
> image is non-primary. Is this wrong?

You have always been able to access non-primary images for read-only operations 
(only writes are prevented):

$ rbd info test
rbd image 'test':
<... snip ...>
mirroring primary: false

$ rbd device --device-type nbd map test@1
/dev/nbd0
$ mount /dev/nbd0 /mnt/
mount: /mnt: WARNING: device write-protected, mounted read-only.
$ ll /mnt/
total 0
-rw-r--r--. 1 root root 0 Nov 21 10:20 hello.world

> Thanks,
> -Vikas
>
> -Original Message-
> From: Jason Dillaman 
> Sent: Thursday, November 21, 2019 9:58 AM
> To: Vikas Rana 
> Cc: ceph-users 
> Subject: Re: [ceph-users] RBD Mirror DR Testing
>
> On Thu, Nov 21, 2019 at 9:56 AM Jason Dillaman  wrote:
> >
> > On Thu, Nov 21, 2019 at 8:49 AM Vikas Rana  wrote:
> > >
> > > Thanks Jason for such a quick response. We are on 12.2.10.
> > >
> > > Checksumming a 200TB image will take a long time.
> >
> > How would mounting an RBD image and scanning the image be faster? 
> > Are you only using a small percentage of the image?
>
> ... and of course, you can mount an RBD snapshot in read-only mode.
>
> > > To test the DR copy by mounting it, these are the steps I'm 
> > > planning to follow:
> > > 1. Demote the Prod copy and promote the DR copy
> > > 2. Do we have to recreate the rbd mirror relationship going from DR to 
> > > primary?
> > > 3. Mount and validate the data
> > > 4. Demote the DR copy and promote the Prod copy
> > > 5. Revert the peer relationship if required?
> > >
> > > Did I do it right or miss anything?
> >
> > You cannot change the peers or you will lose the relationship. If 
> > you insist on your course of action, you just need to be configured 
> > for two-way mirroring and leave it that way.
> >
> > >
> > > Thanks,
> > > -Vikas
> > >
> > > -Original Message-
> > > From: Jason Dillaman 
> > > Sent: Thursday, November 21, 2019 8:33 AM
> > > To: Vikas Rana 
> > > Cc: ceph-users 
> > > Subject: Re: [ceph-users] RBD Mirror DR Testing
> > >
> > > On Thu, Nov 21, 2019 at 8:29 AM Vikas Rana  wrote:
> > > >
> > > > Hi all,
> > > >
> > > >
> > > >
> > > > We have a 200TB RBD image which we are replicating using RBD mirroring.
> > > >
> > > > We want to test the DR copy and make sure that we have a consistent 
> > > > copy in case primary site is lost.
> > > >
> > > >
> > > >
> > > > We did it previously and promoted the DR copy, which broke the DR copy 
> > > > away from the primary, and we had to resync the whole 200TB of data.
> > > >
> > > >
> > > >
> > > > Is there any correct way of doing it so we don’t have to resync all 
> > > > 200TB again?
> > >
> > > Yes, create a snapshot on the primary site and let it propagate to the 
> > > non-primary site. Then you can compare checksums at the snapshot w/o 
> > > having to worry about the data changing. Once you have finished, delete 
> > > the snapshot on the primary site and it will propagate over to the 
> > > non-primary site.
> > >
> > > >
> > > >
> > > > Can we demote current primary and then promote the DR copy and test and 
> > > > then revert back? Will that require the complete 200TB sync?
> > > >
> > >
> > > It's only the forced-promotion that causes split-brain. If you gracefully 
> > > demote from site A and promote site B, and then demote site B and promote 
> > > site A, that will not require a sync. However, again, it's probably just 
> > > easier to use a snapshot.
> > >
> > > >
> > > > Thanks in advance for your help and suggestions.
> > > >
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > -Vikas
> > > >
> > > > ___
> > > > ceph-users mailing list
> > > > ceph-users@lists.ceph.com
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> > >
> > >
> > > --
> > > Jason
> > >
> > >
> >
> >
> > --
> > Jason
>
>
>
> --
> Jason
>
>


--
Jason


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] dashboard hangs

2019-11-22 Thread Oliver Freyermuth

Hi,

On 2019-11-20 15:55, thoralf schulze wrote:

hi,

we were able to track this down to the auto balancer: disabling the auto
balancer and cleaning out old (and probably not very meaningful)
upmap-entries via ceph osd rm-pg-upmap-items brought back stable mgr
daemons and a usable dashboard.


I can confirm that. In our case I see this on a 14.2.4 cluster (which started 
its life with an earlier Nautilus version and developed this issue over the 
past weeks), and doing:
 ceph balancer off
has been sufficient to make the mgrs stable again (i.e. I left the upmap-items 
in place).
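
For reference, a rough sketch of the commands involved (the pg id in the last
line is a placeholder):

 ceph balancer status
 ceph balancer off
 ceph osd dump | grep pg_upmap_items
 ceph osd rm-pg-upmap-items 1.2f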

Interestingly, we did not see this with Luminous or Mimic on different clusters 
(which however have a more stable number of OSDs).

@devs: If there's any more info needed to track this down, please let us know.

Cheers,
Oliver



the not-so-sensible upmap-entries might or might not have been caused by
us updating from mimic to nautilus - it's too late to debug this now.
this seems to be consistent with bryan stillwell's findings ("mgr hangs
with upmap balancer").

thank you very much & with kind regards,
thoralf.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com