[ceph-users] Re: Ceph OIDC Integration

2020-10-20 Thread Pritha Srivastava
Hello, The next Octopus release should be out in 3-4 weeks. In Octopus, shadow users aren't created (for federated OIDC users). But we later realised that shadow users are needed to maintain user stats, so the code for this is currently being added and should be

[ceph-users] Re: multiple OSD crash, unfound objects

2020-10-20 Thread Frank Schilder
Dear Michael,

> > Can you create a test pool with pg_num=pgp_num=1 and see if the PG gets an OSD mapping?

I meant here with crush rule replicated_host_nvme. Sorry, forgot.

> Yes, the OSD was still out when the previous health report was created.

Hmm, this is odd. If this is correct, then
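A hedged sketch of the suggested test (the pool name test-nvme is hypothetical; the crush rule name comes from the thread):

# create a one-PG pool on the replicated_host_nvme rule and check its mapping
ceph osd pool create test-nvme 1 1 replicated replicated_host_nvme
ceph pg ls-by-pool test-nvme   # does the single PG get an OSD mapping?
# clean up afterwards (requires mon_allow_pool_delete=true)
ceph osd pool delete test-nvme test-nvme --yes-i-really-really-mean-it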

[ceph-users] Huge RAM Usage on OSD recovery

2020-10-20 Thread Ing . Luis Felipe Domínguez Vega
Hi, today my infra provider had a blackout, and Ceph then tried to recover but is in an inconsistent state, because many OSDs cannot recover: the kernel kills them by OOM. Even one OSD that was OK has now gone down, OOM-killed. Even on a server with 32GB RAM the OSD uses ALL of that and
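Not a fix from the thread, but a common mitigation for OSDs being OOM-killed during recovery is to lower the per-OSD memory target; a hedged example (2 GiB is an illustrative value):

ceph config set osd osd_memory_target 2147483648
ceph config get osd osd_memory_target   # verify the new target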

[ceph-users] Re: multiple OSD crash, unfound objects

2020-10-20 Thread Michael Thomas
On 10/20/20 1:18 PM, Frank Schilder wrote: Dear Michael, Can you create a test pool with pg_num=pgp_num=1 and see if the PG gets an OSD mapping? I meant here with crush rule replicated_host_nvme. Sorry, forgot. Seems to have worked fine: https://pastebin.com/PFgDE4J1 Yes, the OSD was

[ceph-users] v14.2.12 Nautilus released

2020-10-20 Thread David Galloway
This is the 12th backport release in the Nautilus series. This release brings a number of bugfixes across all major components of Ceph. We recommend that all Nautilus users upgrade to this release. For detailed release notes with links and a changelog, please refer to the official blog entry at
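Not part of the announcement, but a minimal sketch of the usual rolling upgrade for a package-based install:

ceph osd set noout      # avoid rebalancing while daemons restart
# upgrade packages and restart mons, then mgrs, then OSDs, host by host
ceph versions           # confirm every daemon reports 14.2.12
ceph osd unset noout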

[ceph-users] Re: Problems with ceph command - Octopus - Ubuntu 16.04

2020-10-20 Thread Emanuel Alejandro Castelli
Hello Eugen, Rebooting the other two MONs fixed the problem:

root@osswrkprbe001:~# ceph status
  cluster:
    id:     56820176-ae5b-4e58-84a2-442b2fc03e6d
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum osswrkprbe001,osswrkprbe002,osswrkprbe003 (age 3m)
    mgr:

[ceph-users] Re: Problems with ceph command - Octopus - Ubuntu 16.04

2020-10-20 Thread Emanuel Alejandro Castelli
And the same for MON3:

[5243018.443159] libceph: mon1 192.168.14.151:6789 socket closed (con state CONNECTING)
[5243033.801504] libceph: mon2 192.168.14.152:6789 socket error on write
[5243034.473450] libceph: mon2 192.168.14.152:6789 socket error on write
[5243035.497397] libceph: mon2

[ceph-users] Re: Problems with ceph command - Octopus - Ubuntu 16.04

2020-10-20 Thread Emanuel Alejandro Castelli
From MON1, dmesg I get this:

[3348025.306195] libceph: mon1 192.168.14.151:6789 socket closed (con state CONNECTING)
[3348033.241973] libceph: mon1 192.168.14.151:6789 socket closed (con state CONNECTING)
[3348048.089325] libceph: mon2 192.168.14.152:6789 socket closed (con state CONNECTING)

[ceph-users] Re: Problems with ceph command - Octopus - Ubuntu 16.04

2020-10-20 Thread Emanuel Alejandro Castelli
I have 3 MONs, I don't know why it's showing only one.

root@osswrkprbe001:~# ceph --connect-timeout 60 status
Cluster connection interrupted or timed out

cephadm logs --name mon.osswrkprbe001 --> Is there any way to go to a specific date? Because it starts from Oct 4, and I want to check from Oct 16
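cephadm's logs subcommand wraps journalctl, and extra arguments after "--" are passed through to it; a hedged example for jumping to a date:

cephadm logs --name mon.osswrkprbe001 -- --since "2020-10-16"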

[ceph-users] Problems with ceph command - Octopus - Ubuntu 16.04

2020-10-20 Thread Emanuel Alejandro Castelli
Hello, I'm facing an issue with Ceph: I cannot run any ceph command, it literally hangs. I need to hit CTRL-C to get this:

^CCluster connection interrupted or timed out

This is on Ubuntu 16.04. Also, I use Grafana with Prometheus to get information from the cluster, but now there

[ceph-users] Re: pool pgp_num not updated

2020-10-20 Thread Mac Wynkoop
OK, so for interventions, I've pushed these configs out (old value > new value):

ceph config set mon.* target_max_misplaced_ratio 0.05 > 0.20
ceph config set osd.* osd_max_backfills 1 > 4
ceph config set osd.* osd_recovery_max_active 1 > 4

And also ran injectargs to push the changes to the OSDs hot. I'll monitor it
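A hedged example of that injectargs step (the exact flags used are not in the excerpt):

ceph tell osd.* injectargs '--osd_max_backfills 4 --osd_recovery_max_active 4'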

[ceph-users] Re: pool pgp_num not updated

2020-10-20 Thread Eugen Block
The default for max misplaced objects is this (5%):

ceph-node1:~ # ceph config get mon target_max_misplaced_ratio
0.05

You can increase this for the splitting process, but I would recommend rolling it back as soon as the splitting has finished.

Quoting Lindsay Mathieson: On 20/10/2020
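A minimal sketch of that raise-then-rollback approach (0.20 is an illustrative value):

ceph config set mon target_max_misplaced_ratio 0.20   # allow more misplaced objects while PGs split
# ... once pgp_num has reached its target:
ceph config set mon target_max_misplaced_ratio 0.05   # restore the default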

[ceph-users] Re: pool pgp_num not updated

2020-10-20 Thread Lindsay Mathieson
On 20/10/2020 11:38 pm, Mac Wynkoop wrote:

> Autoscaler isn't on, what part of Ceph is handling the increase of pgp_num? Because I'd like to turn up the rate at which it splits the PGs, but if autoscaler isn't doing it, I'd have no clue what to adjust. Any ideas?

Normal recovery ops I imagine -

[ceph-users] Re: pool pgp_num not updated

2020-10-20 Thread Mac Wynkoop
Alrighty, so we're all recovered and balanced at this point, but I'm not seeing this behavior:

pool 40 'hou-ec-1.rgw.buckets.data' erasure size 9 min_size 7 crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1109 pgp_num_target 2048 last_change 8654141 lfor 0/0/8445757 flags
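A hedged way to watch pgp_num converge toward pgp_num_target (pool name from the excerpt above):

ceph osd pool get hou-ec-1.rgw.buckets.data pgp_num
ceph osd pool ls detail | grep hou-ec-1.rgw.buckets.data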

[ceph-users] Re: Problems with ceph command - Octopus - Ubuntu 16.04

2020-10-20 Thread Eugen Block
Your 'cephadm ls' output was only from one node, so I assumed you had just bootstrapped the first node. The 'cephadm logs' command should provide paged output so you can scroll or search for a specific date. I'm not sure what caused this, but "error on write" is bad. As I already wrote, check the

[ceph-users] Re: Problems with ceph command - Octopus - Ubuntu 16.04

2020-10-20 Thread Eugen Block
Your mon container seems to be up and running, have you tried restarting it? You have just one mon, is that correct? Do you see anything in the logs?

cephadm logs --name mon.osswrkprbe001

How long do you wait until you hit CTRL-C? There's a connect-timeout option for ceph commands, maybe try a
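Hedged examples of both suggestions (the systemd unit name follows the usual cephadm pattern; substitute your cluster fsid):

ceph --connect-timeout 60 status
systemctl restart ceph-<fsid>@mon.osswrkprbe001.service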

[ceph-users] ceph octopus centos7, containers, cephadm

2020-10-20 Thread Marc Roos
I am running Nautilus on CentOS 7. Does Octopus run similarly to Nautilus, i.e.:
- runs on el7/centos7
- runs without containers by default
- runs without cephadm by default

[ceph-users] Re: Ceph Octopus

2020-10-20 Thread Eugen Block
> I wonder if this would be impactful, even if `nodown` were set. When a given OSD latches onto the new replication network, I would expect it to want to use it for heartbeats — but when its heartbeat peers aren't using the replication network yet, they won't be reachable.

I also expected

[ceph-users] Re: Recommended settings for PostgreSQL

2020-10-20 Thread Marc Roos
I wanted to create a few stateful containers with mysql/postgres that do not depend on local persistent storage, so I can dynamically move them around. What about using:
- a 1x replicated pool and rbd mirror,
- or having postgres use 2 1x replicated pools,
- or upon task launch create
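A minimal sketch of the first option, assuming a hypothetical pool name pg-data and the size-1 (unreplicated) pool described above:

ceph osd pool create pg-data 32 32 replicated
ceph osd pool set pg-data size 1          # Octopus additionally requires mon_allow_pool_size_one=true
rbd pool init pg-data
rbd mirror pool enable pg-data pool       # mirror every image in the pool to a peer cluster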

[ceph-users] Re: Ceph Octopus

2020-10-20 Thread Anthony D'Atri
I wonder if this would be impactful, even if `nodown` were set. When a given OSD latches onto the new replication network, I would expect it to want to use it for heartbeats — but when its heartbeat peers aren’t using the replication network yet, they won’t be reachable. Unless something has

[ceph-users] Re: Ceph Octopus

2020-10-20 Thread Eugen Block
Hi, a quick search [1] shows this:

---snip---
# set new config
ceph config set global cluster_network 192.168.1.0/24

# let orchestrator reconfigure the daemons
ceph orch daemon reconfig mon.host1
ceph orch daemon reconfig mon.host2
ceph orch daemon reconfig mon.host3
ceph orch daemon reconfig
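A hedged follow-up to confirm the setting landed (not part of the quoted snippet):

ceph config dump | grep cluster_network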

[ceph-users] Re: Mon DB compaction MON_DISK_BIG

2020-10-20 Thread Szabo, Istvan (Agoda)
Okay, thank you very much.

[ceph-users] Re: Mon DB compaction MON_DISK_BIG

2020-10-20 Thread Szabo, Istvan (Agoda)
Hi, yeah, I ran them sequentially and waited for each to finish, and it looks like it is still doing something in the background, because it is now at 9.5GB even though it reports the compaction as done. I think the ceph tell compact initiated a harder compaction, so I'm not sure how far it will go down, but it looks promising. When I sent the email
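The command referred to above, written out in full (mon id taken from the thread; a hedged reconstruction):

ceph tell mon.monserver-2c01 compact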

[ceph-users] Mon DB compaction MON_DISK_BIG

2020-10-20 Thread Szabo, Istvan (Agoda)
Hi, I received a warning this morning:

HEALTH_WARN mons monserver-2c01,monserver-2c02,monserver-2c03 are using a lot of disk space
MON_DISK_BIG mons monserver-2c01,monserver-2c02,monserver-2c03 are using a lot of disk space
    mon.monserver-2c01 is 15.3GiB >= mon_data_size_warn (15GiB)
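Hedged commands for inspecting the threshold and compacting at restart (both are standard Ceph options):

ceph config get mon mon_data_size_warn           # the 15GiB threshold that fired
ceph config set mon mon_compact_on_start true    # compact the mon store at the next restart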