[ceph-users] ansible 2.8 for Nautilus

2019-05-20 Thread solarflow99
Does anyone know the necessary steps to install ansible 2.8 on RHEL 7? I'm assuming most people are doing it with pip?
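
A minimal sketch of the pip-based route the poster asks about, assuming EPEL is enabled on the RHEL 7 host (package name and version pin are illustrative):

    sudo yum install -y python2-pip          # from EPEL; a virtualenv works just as well
    sudo pip install "ansible>=2.8,<2.9"     # pin to the 2.8 series
    ansible --version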

Re: [ceph-users] Large OMAP Objects in default.rgw.log pool

2019-05-20 Thread mr. non non
Has anyone had this issue before? From my research, many people have issues with rgw.index related to too small a number of index shards (too many objects per index). I also checked this thread http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033611.html but haven't found any
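
A minimal sketch for locating the offending object when the warning points at default.rgw.log rather than a bucket index (object name is a placeholder; run the grep on a mon host):

    grep -i 'large omap object' /var/log/ceph/ceph.log             # cluster log names the pool, PG and object
    rados -p default.rgw.log listomapkeys <object-name> | wc -l    # count the omap keys on the reported object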

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Maged Mokhtar
Not sure. In general important fixes get backported, but will have to wait and see. /Maged On 20/05/2019 22:11, Frank Schilder wrote: Dear Maged, thanks for elaborating on this question. Is there already information in which release this patch will be deployed? Best regards,

[ceph-users] PG stuck in Unknown after removing OSD - Help?

2019-05-20 Thread Tarek Zegar
Set 3 OSDs to "out"; all were on the same host and should not impact the pool because it's 3x replication and CRUSH picks one OSD per host. However, now we have one PG stuck UNKNOWN. Not sure why this is the case; I did have background writes going on at the time of the OSD out. Thoughts? ceph osd tree ID
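
A minimal sketch for narrowing down an unknown PG after marking OSDs out (pgid and OSD id are placeholders):

    ceph health detail            # lists the PG stuck in unknown
    ceph pg map <pgid>            # current up/acting sets, if any
    ceph pg <pgid> query          # may not respond while no OSD claims the PG
    ceph osd in osd.<id>          # bringing the out'ed OSDs back in should let the PG peer again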

[ceph-users] CephFS msg length greater than osd_max_write_size

2019-05-20 Thread Ryan Leimenstoll
Hi all, We recently encountered an issue where our CephFS filesystem was unexpectedly set to read-only. Looking at some of the logs from the daemons, I can see the following on the MDS: ... 2019-05-18 16:34:24.341 7fb3bd610700 -1 mds.0.89098 unhandled write error (90) Message too long,
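
A minimal sketch for checking the limit involved, assuming, as the subject suggests, that the failed MDS write exceeded the OSDs' osd_max_write_size (default 90 MB); raising it is only a workaround, not a root-cause fix (OSD id is a placeholder; run the daemon command on the OSD host):

    ceph daemon osd.<id> config get osd_max_write_size
    ceph tell osd.* injectargs '--osd_max_write_size 256'   # runtime change; persist it in ceph.conf as well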

Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd

2019-05-20 Thread Jason Dillaman
On Mon, May 20, 2019 at 2:17 PM Marc Schöchlin wrote: > > Hello cephers, > > we have a few systems which utilize an rbd-nbd map/mount to get access to a rbd > volume. > (This problem seems to be related to "[ceph-users] Slow requests from > bluestore osds" (the original thread)) > > Unfortunately

Re: [ceph-users] ceph nautilus namespaces for rbd and rbd image access problem

2019-05-20 Thread Jason Dillaman
On Mon, May 20, 2019 at 11:14 AM Rainer Krienke wrote: > > Am 20.05.19 um 09:06 schrieb Jason Dillaman: > > >> $ rbd --namespace=testnamespace map rbd/rbdtestns --name client.rainer > >> --keyring=/etc/ceph/ceph.keyring > >> rbd: sysfs write failed > >> rbd: error opening image rbdtestns: (1)

[ceph-users] PG stuck down after OSD failures and recovery

2019-05-20 Thread Krzysztof Klimonda
Hi, We've observed the following chain of events on our Luminous (12.2.8) cluster. To summarize: we're running this pool with min_size = 1, size = 2. On Friday we "temporarily" lost osd.52 (the disk changed its letter and the host had to be restarted, which we planned on doing after the weekend),
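
A minimal sketch for inspecting a down PG in this situation (pgid is a placeholder):

    ceph pg dump_stuck inactive
    ceph pg <pgid> query | grep -A10 down_osds_we_would_probe   # shows which failed OSD peering is still waiting for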

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Frank Schilder
Dear Maged, thanks for elaborating on this question. Is there already information in which release this patch will be deployed? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Frank Schilder
If min_size=1 and you lose the last disk, that's the end of any data that was only on this disk. Apart from this, using size=2 and min_size=1 is a really bad idea. This has nothing to do with data replication but rather with an inherent problem of high availability and the number 2. You need at
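
A quick way to review which pools run with the risky size=2/min_size=1 combination and to move them to the usual size=3/min_size=2 (pool name is a placeholder):

    ceph osd pool ls detail            # shows size and min_size per pool
    ceph osd pool set <pool> size 3
    ceph osd pool set <pool> min_size 2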

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Maged Mokhtar
On 20/05/2019 19:37, Frank Schilder wrote: This is an issue that is coming up every now and then (for example: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg50415.html) and I would consider it a very serious one (I will give an example below). A statement like "min_size = k is

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Frank Schilder
Dear Paul, thank you very much for this clarification. I believe ZFS erasure-coded data also has this property, which is probably the main cause for the expectation of min_size=k. So, basically, min_size=k means that we are at the security level of traditional redundant storage and this may or

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Paul Emmerich
Yeah, the current situation with recovery and min_size is... unfortunate :( The reason why min_size = k is bad is just that it means you are accepting writes without guaranteeing durability while you are in a degraded state. A durable storage system should never tell a client "okay, I've written
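
For an EC pool, the corresponding check-and-set looks like this (pool and profile names are placeholders); the recommendation in this thread is min_size = k+1:

    ceph osd erasure-code-profile get <profile>   # shows k and m
    ceph osd pool get <ecpool> min_size
    ceph osd pool set <ecpool> min_size <k+1>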

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Frank Schilder
This is an issue that is coming up every now and then (for example: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg50415.html) and I would consider it a very serious one (I will give an example below). A statement like "min_size = k is unsafe and should never be set" deserves a bit

Re: [ceph-users] Noob question - ceph-mgr crash on arm

2019-05-20 Thread Torben Hørup
Hi, tcmalloc on ARMv7 is problematic. You need to compile your own build with either jemalloc or plain libc malloc. /Torben On 20 May 2019 17.48.40 CEST, "Jesper Taxbøl" wrote: >I am trying to set up a Ceph cluster on 4 odroid-hc2 instances on top of >Ubuntu 18.04. > >My ceph-mgr daemon keeps crashing
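
A rough sketch of a source build that swaps out the allocator, assuming you build from the Ceph git tree (check out the tag matching your cluster release first; branch selection omitted here):

    git clone https://github.com/ceph/ceph.git && cd ceph
    ./install-deps.sh
    ./do_cmake.sh -DALLOCATOR=libc    # or -DALLOCATOR=jemalloc
    cd build && make -j2              # keep the job count low on the odroid's limited RAM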

[ceph-users] Noob question - ceph-mgr crash on arm

2019-05-20 Thread Jesper Taxbøl
I am trying to set up a Ceph cluster on 4 odroid-hc2 instances on top of Ubuntu 18.04. My ceph-mgr daemon keeps crashing on me. Any advice on how to proceed? The log on the mgr node says something about ms_dispatch: 2019-05-20 15:34:43.070424 b6714230 0 set uid:gid to 64045:64045 (ceph:ceph)

Re: [ceph-users] Could someone can help me to solve this problem about ceph-STS(secure token session)

2019-05-20 Thread Pritha Srivastava
Hello Yuan, While creating the role, can you try setting the Principal to the user you want the role to be assumed by, and the Action to sts:AssumeRole, like below: policy_document =
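
A sketch of what such a trust policy can look like when the role is created with radosgw-admin ('S3Access' and 'TESTER' are placeholder names; adjust to your role and user):

    radosgw-admin role create --role-name=S3Access \
        --assume-role-policy-doc='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["arn:aws:iam:::user/TESTER"]},"Action":["sts:AssumeRole"]}]}'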

Re: [ceph-users] mimic: MDS standby-replay causing blocked ops (MDS bug?)

2019-05-20 Thread Frank Schilder
Dear Yan, thank you for taking care of this. I removed all snapshots and stopped snapshot creation. Please keep me posted. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Yan, Zheng Sent: 20 May 2019 13:34:07

Re: [ceph-users] Large OMAP Objects in default.rgw.log pool

2019-05-20 Thread mr. non non
Hi Manuel, I use version 12.2.8 with bluestore and also use manual index sharding (configured to 100). As I checked, no buckets reach 100k objects_per_shard. Here are the health status and cluster log: # ceph health detail HEALTH_WARN 1 large omap objects LARGE_OMAP_OBJECTS 1 large omap
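
A minimal sketch for confirming the per-shard fill level on the RGW side (bucket name is a placeholder):

    radosgw-admin bucket limit check             # objects_per_shard and fill_status per bucket
    radosgw-admin bucket stats --bucket=<name>   # per-bucket usage (num_objects etc.)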

Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd

2019-05-20 Thread Marc Schöchlin
Hello cephers, we have a few systems which utilize an rbd-nbd map/mount to get access to a rbd volume. (This problem seems to be related to "[ceph-users] Slow requests from bluestore osds" (the original thread)) Unfortunately the rbd-nbd device of a system has crashed on three Mondays in a row at
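
A minimal sketch for catching the slow requests on the OSD side while the rbd-nbd hang is happening (OSD id is a placeholder; run on the OSD host):

    ceph daemon osd.<id> dump_ops_in_flight   # currently blocked ops with their age
    ceph daemon osd.<id> dump_historic_ops    # recently completed slow ops with per-stage timings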

Re: [ceph-users] mimic: MDS standby-replay causing blocked ops (MDS bug?)

2019-05-20 Thread Yan, Zheng
On Sat, May 18, 2019 at 5:47 PM Frank Schilder wrote: > > Dear Yan and Stefan, > > it happened again and there were only very few ops in the queue. I pulled the > ops list and the cache. Please find a zip file here: > "https://files.dtu.dk/u/w6nnVOsp51nRqedU/mds-stuck-dirfrag.zip?l; . It's a bit

Re: [ceph-users] inconsistent number of pools

2019-05-20 Thread Lars Täuber
Mon, 20 May 2019 10:52:14 + Eugen Block ==> ceph-users@lists.ceph.com : > Hi, have you tried 'ceph health detail'? > No I hadn't. Thanks for the hint.

Re: [ceph-users] inconsistent number of pools

2019-05-20 Thread Eugen Block
Hi, have you tried 'ceph health detail'? Quoting Lars Täuber: Hi everybody, with the status report I get a HEALTH_WARN I don't know how to get rid of. It may be connected to recently removed pools. # ceph -s cluster: id: 6cba13d1-b814-489c-9aac-9c04aaf78720 health:

[ceph-users] inconsistent number of pools

2019-05-20 Thread Lars Täuber
Hi everybody, with the status report I get a HEALTH_WARN I don't know how to get rid of. It may be connected to recently removed pools. # ceph -s cluster: id: 6cba13d1-b814-489c-9aac-9c04aaf78720 health: HEALTH_WARN 1 pools have many more objects per pg than average
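
A minimal sketch for identifying the skewed pool behind that warning; the threshold is mon_pg_warn_max_object_skew (read by the mon or, in newer releases, the mgr), and either raising it or giving the pool more PGs clears the warning (pool name is a placeholder):

    ceph health detail              # names the pool
    ceph df                         # per-pool object counts
    ceph osd pool get <pool> pg_num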

[ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-20 Thread Marc Roos
I got my first problem with cephfs in a production environment. Is it possible to deduce from these logfiles what happened? svr1 is connected to the ceph client network via a switch; the svr2 vm is colocated on the c01 node. c01 has OSDs and mon.a colocated. svr1 was the first to report errors at
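
A minimal sketch of what to capture the next time this happens, to correlate the client load with MDS state (daemon name is a placeholder; run on the MDS host):

    ceph health detail
    ceph daemon mds.<name> session ls            # per-client sessions, caps held, client addresses
    ceph daemon mds.<name> dump_ops_in_flight    # requests the MDS is currently stuck on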

Re: [ceph-users] Large OMAP Objects in default.rgw.log pool

2019-05-20 Thread EDH - Manuel Rios Fernandez
Hi Arnondh, What's your ceph version? Regards From: ceph-users On behalf of mr. non non Sent: Monday, 20 May 2019 12:39 To: ceph-users@lists.ceph.com Subject: [ceph-users] Large OMAP Objects in default.rgw.log pool Hi, I found the same issue as above. Does

[ceph-users] Large OMAP Objects in default.rgw.log pool

2019-05-20 Thread mr. non non
Hi, I found the same issue as described above. Does anyone know how to fix it? Thanks. Arnondh

Re: [ceph-users] Massive TCP connection on radosgw

2019-05-20 Thread Li Wang
Hi John, Thanks for your reply. We have also restarted the server to get rid of it. Hi All, Does anybody know a better solution than restarting the server? Since we use radosgw in production, we cannot afford a service restart on a daily basis. Regards, Li Wang > On 20 May 2019, at 2:48 PM,
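
A minimal sketch for watching the connection build-up without restarting, assuming the default civetweb frontend port 7480 (adjust to your frontend configuration):

    ss -tn state established '( sport = :7480 )' | wc -l   # total established connections to radosgw
    ss -tn state close-wait '( sport = :7480 )' | wc -l    # connections the peer closed but radosgw never did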

Re: [ceph-users] ceph nautilus namespaces for rbd and rbd image access problem

2019-05-20 Thread Rainer Krienke
Am 20.05.19 um 09:06 schrieb Jason Dillaman: >> $ rbd --namespace=testnamespace map rbd/rbdtestns --name client.rainer >> --keyring=/etc/ceph/ceph.keyring >> rbd: sysfs write failed >> rbd: error opening image rbdtestns: (1) Operation not permitted >> In some cases useful info is found in syslog

[ceph-users] Could someone can help me to solve this problem about ceph-STS(secure token session)

2019-05-20 Thread Yuan Minghui
Hello everyone: When I use the method "assume_role", like this: sts_client = boto3.client('sts', aws_access_key_id=access_key, aws_secret_access_key=secret_key, endpoint_url=host, ) response =

[ceph-users] Monitor Crash while adding OSD (Luminous)

2019-05-20 Thread Henry Spanka
Hi, I recently upgraded my cluster to Luminous v12.2.11. While adding a new OSD, the active monitor crashed (attempt to free invalid pointer). The other mons are still running but the OSD is stuck in the "new" state. Attempting to restart the OSD process will crash the monitor again. Can anybody look

Re: [ceph-users] Major ceph disaster

2019-05-20 Thread Kevin Flöh
Hi Frederic, we do not have access to the original OSDs. We exported the remaining shards of the two pgs but we are only left with two shards (of reasonable size) per pg. The rest of the shards displayed by ceph pg query are empty. I guess marking the OSD as complete doesn't make sense then.

Re: [ceph-users] ceph nautilus namespaces for rbd and rbd image access problem

2019-05-20 Thread Jason Dillaman
On Mon, May 20, 2019 at 9:08 AM Rainer Krienke wrote: > > Hello, > > just saw this message on the client when trying and failing to map the > rbd image: > > May 20 08:59:42 client kernel: libceph: bad option at > '_pool_ns=testnamespace' You will need kernel v4.19 (or later) I believe to utilize

Re: [ceph-users] ceph nautilus namespaces for rbd and rbd image access problem

2019-05-20 Thread Rainer Krienke
Hello, just saw this message on the client when trying and failing to map the rbd image: May 20 08:59:42 client kernel: libceph: bad option at '_pool_ns=testnamespace' Rainer Am 20.05.19 um 08:56 schrieb Rainer Krienke: > Hello, > > on a ceph Nautilus cluster (14.2.1) running on Ubuntu 18.04

Re: [ceph-users] ceph nautilus namespaces for rbd and rbd image access problem

2019-05-20 Thread Jason Dillaman
On Mon, May 20, 2019 at 8:56 AM Rainer Krienke wrote: > > Hello, > > on a ceph Nautilus cluster (14.2.1) running on Ubuntu 18.04 I try to set > up rbd images with namespaces in order to allow different clients to > access only their "own" rbd images in different namespaces in just one > pool. The

[ceph-users] ceph nautilus namespaces for rbd and rbd image access problem

2019-05-20 Thread Rainer Krienke
Hello, on a ceph Nautilus cluster (14.2.1) running on Ubuntu 18.04 I am trying to set up rbd images with namespaces in order to allow different clients to access only their "own" rbd images in different namespaces in just one pool. The rbd image data are in an erasure-coded pool named "ecpool" and
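
A sketch of the namespace setup being described, under the assumption that the client caps are restricted per namespace and the kernel client is new enough (v4.19+, as noted later in the thread); names match the ones used in this thread, and the data-pool cap line is an assumption:

    rbd namespace create --pool rbd --namespace testnamespace
    ceph auth caps client.rainer mon 'profile rbd' \
        osd 'profile rbd pool=rbd namespace=testnamespace, profile rbd pool=ecpool'
    rbd create --size 1G --namespace testnamespace --data-pool ecpool rbd/rbdtestns
    rbd --namespace=testnamespace map rbd/rbdtestns --name client.rainer --keyring=/etc/ceph/ceph.keyring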

Re: [ceph-users] Massive TCP connection on radosgw

2019-05-20 Thread John Hearns
I found similar behaviour on a Nautilus cluster on Friday: around 300,000 open connections, which I think were the result of a benchmarking run that was terminated. I restarted the radosgw service to get rid of them. On Mon, 20 May 2019 at 06:56, Li Wang wrote: > Dear ceph community members, >