[ceph-users] ceph repo cert expired

2021-03-12 Thread Philip Brown
To whom it may concern: failure: repodata/repomd.xml from Ceph: [Errno 256] No more mirrors to try. https://download.ceph.com/rpm-octopus/el7/x86_64/repodata/repomd.xml: [Errno 14] curl#60 - "Peer's Certificate has expired."
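One way to confirm from a client that it really is the mirror's certificate (and not a local clock or CA-bundle problem), sketched assuming an el7 host with openssl and yum; the sslverify workaround is insecure and only for bridging the gap until the certificate is renewed:

    # show the validity dates of the certificate presented by download.ceph.com
    echo | openssl s_client -connect download.ceph.com:443 -servername download.ceph.com 2>/dev/null \
      | openssl x509 -noout -dates

    # temporary (insecure) workaround: add sslverify=0 to the relevant repo
    # sections in /etc/yum.repos.d/ceph.repo, then refresh the metadata
    yum clean all && yum makecache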

[ceph-users] Re: Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread Josh Durgin
Thanks for the reports - sorry we missed this. It's safe to ignore the import error - it's for static type checking in python. https://tracker.ceph.com/issues/49762 We'll release this fix next week. Josh On 3/12/21 10:43 AM, Stefan Kooman wrote: On 3/12/21 6:18 PM, David Caro wrote: I got

[ceph-users] Re: Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread Stefan Kooman
On 3/12/21 6:18 PM, David Caro wrote: I got the latest docker image from the public docker repo: dcaro@vulcanus$ docker pull ceph/daemon:latest-nautilus latest-nautilus: Pulling from ceph/daemon 2d473b07cdd5: Pull complete 6ab62ee0cbfb: Pull complete 8d5f9072ae2b: Pull complete 5cf35aefd364: Pu

[ceph-users] Re: Removing secondary data pool from mds

2021-03-12 Thread Michael Thomas
Hi Frank, I finally got around to removing the data pool. It went without a hitch. Ironically, about a week before I got around to removing the pool, I suffered the same problem as before, except this time it wasn't a power glitch that took out the OSDs, it was my own careless self who decide
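For reference, removing a CephFS data pool generally looks like the sketch below; the filesystem and pool names are placeholders, and the pool should no longer be referenced by any file layout before it is dropped:

    ceph fs rm_data_pool <fs_name> <pool_name>                              # detach the pool from the filesystem
    ceph osd pool rm <pool_name> <pool_name> --yes-i-really-really-mean-it  # delete it (needs mon_allow_pool_delete=true)
    ceph df                                                                 # confirm the pool is gone and space is reclaimed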

[ceph-users] Re: Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread Stefan Kooman
On 3/12/21 5:46 PM, David Caro wrote: I might be wrong, but maybe the containers are missing something? The easiest way to check is accessing those directly, but from the looks of it, it seems like some python package/installation issue. Adding also more info like 'ceph versions', 'docker images'/

[ceph-users] Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread Stefan Kooman
Hi, After upgrading a Ceph cluster to 14.2.17 with ceph-ansible (docker containers) the manager hits an issue: Module 'volumes' has failed dependency: No module named typing, python trace: 2021-03-12 17:04:22.358 7f299ac75e40 1 mgr[py] Loading python module 'volumes' 2021-03-12 17:04:22.4

[ceph-users] Recommendations on problem with PG

2021-03-12 Thread Gabriel Medve
Hi, We have a problem with a PG that was inconsistent; currently the PGs in our cluster have 3 copies. It was not possible for us to repair this pg with "ceph pg repair" (This PG is in osd 14,1,2), so we deleted some of the copies on osd 14 with the following command. ceph-objectstore-tool --d
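Before removing replicas with ceph-objectstore-tool it is usually worth inspecting what exactly is inconsistent; a minimal sketch, with the PG id as a placeholder (and the OSD stopped before any objectstore-tool surgery):

    ceph health detail | grep -A2 -i inconsistent
    rados list-inconsistent-obj <pgid> --format=json-pretty   # which object/shard is bad and why (read error, checksum, ...)
    ceph pg repair <pgid>                                     # retry repair once the bad replica has been identified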

[ceph-users] Re: Question about delayed write IOs, octopus, mixed storage

2021-03-12 Thread Maged Mokhtar
On 12/03/2021 17:28, Philip Brown wrote: "First it is not a good idea to mix SSD/HDD OSDs in the same pool," Sorry for not being explicit. I used the cephadm/ceph orch facilities and told them "go set up all my disks". So they automatically set up the SSDs to be WAL devices or whatever. I th

[ceph-users] Re: How big an OSD disk could be?

2021-03-12 Thread Anthony D'Atri
> I assume the limits are those that linux imposes. iops are the limits. One > 20TB has 100 iops and 4x5TB have 400 iops. 400 iops serves more clients than > 100 iops. You decide what you need/want to have. >> Any other aspects on the limits of bigger capacity hard disk drives? > > Recovery wil

[ceph-users] Re: How big an OSD disk could be?

2021-03-12 Thread Martin Verges
A good mix of size and performance is the Seagate 2X14 MACH.2 Dual Actuator 14TB HDD. This drive reports as 2x 7TB individual block devices and you install an OSD on each. https://croit.io/blog/benchmarking-seagate-exos2x14-mach-2-hdds We have a bunch of them in a permanent test cluster if someone wa
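On such dual-actuator drives each half simply shows up as its own block device, so OSD creation is the usual per-device flow; the device names below are examples:

    lsblk                                    # the drive appears as two ~7TB devices, e.g. /dev/sdc and /dev/sdd
    ceph-volume lvm create --data /dev/sdc   # one OSD per half
    ceph-volume lvm create --data /dev/sdd
    # or, with the orchestrator: ceph orch daemon add osd <host>:/dev/sdc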

[ceph-users] Re: Question about delayed write IOs, octopus, mixed storage

2021-03-12 Thread Philip Brown
Here is a partial daemon perf dump, from one of the OSDs. please let me know what else would be useful to look at. "bluefs": { "gift_bytes": 0, "reclaim_bytes": 0, "db_total_bytes": 30006042624, "db_used_bytes": 805298176, "wal_total_bytes": 0,
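For anyone wanting to pull the same counters, a sketch (the OSD id is a placeholder; jq is assumed to be installed, otherwise read the raw JSON):

    ceph daemon osd.<id> perf dump | jq '.bluefs'   # db_total_bytes, db_used_bytes, slow_used_bytes, ...
    ceph health detail | grep -i spillover          # BLUEFS_SPILLOVER warns when the DB overflows onto the slow device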

[ceph-users] Re: How big an OSD disk could be?

2021-03-12 Thread Robert Sander
On 12.03.21 at 18:30, huxia...@horebdata.cn wrote: > Any other aspects on the limits of bigger capacity hard disk drives? Recovery will take longer, increasing the risk of another failure in the same time window. Regards -- Robert Sander Heinlein Support GmbH Schwedter Str. 8/9b, 10119 Berlin http://

[ceph-users] Re: How big an OSD disk could be?

2021-03-12 Thread huxia...@horebdata.cn
Good point. IOPS is indeed a limitation. Thanks, Marc, for the tip. Any other aspects on the limits of bigger capacity hard disk drives? huxia...@horebdata.cn

[ceph-users] Re: Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread David Caro
I got the latest docker image from the public docker repo: dcaro@vulcanus$ docker pull ceph/daemon:latest-nautilus latest-nautilus: Pulling from ceph/daemon 2d473b07cdd5: Pull complete 6ab62ee0cbfb: Pull complete 8d5f9072ae2b: Pull complete 5cf35aefd364: Pull complete 753fcb95c2ca: Pull complete

[ceph-users] Re: How big an OSD disk could be?

2021-03-12 Thread Marc
I assume the limits are those that linux imposes. iops are the limits. One 20TB has 100 iops and 4x5TB have 400 iops. 400 iops serves more clients than 100 iops. You decide what you need/want to have.

[ceph-users] How big an OSD disk could be?

2021-03-12 Thread huxia...@horebdata.cn
Dear cephers, Just wonder how big an OSD disk could be? Currently the biggest HDD has a capacity of 18TB or 20TB. Is it still suitable for an OSD? Is there a limitation on the capacity of a single OSD? Can a single OSD be 30TB, 50TB or 100TB? What could be the potential limit? best regard

[ceph-users] Re: Container deployment - Ceph-volume activation

2021-03-12 Thread Sebastian Wagner
On 11.03.21 at 18:40, 胡 玮文 wrote: > Hi, > > Assuming you are using cephadm? Checkout this > https://docs.ceph.com/en/latest/cephadm/osd/#activate-existing-osds > > > ceph cephadm osd activate ... Might not be backported. see https://tracker.ceph.com/issues/46691#note-1 for the workaround
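For completeness, the documented form and the fallback, sketched with a placeholder host name:

    ceph cephadm osd activate <host>   # only once the backport from the docs link above is available in your release
    # on releases without it, the tracker note's workaround is ceph-volume activate
    # plus 'cephadm adopt', as shown in Kenneth's reply further down this digest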

[ceph-users] Re: Question about delayed write IOs, octopus, mixed storage

2021-03-12 Thread Maged Mokhtar
On 12/03/2021 18:25, Philip Brown wrote: well that is a very interesting statistic. Where do you come up with the 30GB partition size limit number? I believe it is using 28GB SSD per HDD disk :-/ So you are implying that if I "throw away" 1/8 of my HDDs, so that I can get that magic number 30G

[ceph-users] Re: Container deployment - Ceph-volume activation

2021-03-12 Thread Cloud Guy
Thanks everyone. So it seems like cephadm osd activate is a development-branch capability currently, and the workaround is to activate all as legacy daemons, then adopt them as containers. Appreciate the inputs. Will try it out.

[ceph-users] Re: OSD id 241 != my id 248: conversion from "ceph-disk" to "ceph-volume simple" destroys OSDs

2021-03-12 Thread Frank Schilder
Hi Chris, thanks for looking at this issue in more detail. I have two communications on this issue and I'm afraid you didn't get all the information. There seem to be at least 2 occurrences of the same bug. Yes, I'm pretty sure data.path should also be a stable device path instead of /dev/sdq1. Bu

[ceph-users] Re: OSDs crashing after server reboot.

2021-03-12 Thread Cassiano Pilipavicius
Thanks again Igor, using the ceph-bluestore-tool with CEPH_ARGS="--bluestore_hybrid_alloc_mem_cap=0" I was able to detect two OSDs returning IO errors. These OSD crashes caused backfill operations that triggered some OSDs marking others as down due to some kind of slow operations and the
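The invocation referred to above, spelled out as a sketch (the OSD id and path are placeholders; the OSD must be stopped while ceph-bluestore-tool runs):

    systemctl stop ceph-osd@<id>     # or stop the OSD container in a cephadm/docker deployment
    CEPH_ARGS="--bluestore_hybrid_alloc_mem_cap=0" ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-<id>
    # a deeper fsck that also reads object data is available via the tool's --deep option
    systemctl start ceph-osd@<id>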

[ceph-users] Re: Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread David Caro
I might be wrong, but maybe the containers are missing something? The easiest way to check is accessing those directly, but from the looks of it, it seems like some python package/installation issue. Adding also more info like 'ceph versions', 'docker images'/'docker ps' might also help figuring ou

[ceph-users] Re: Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread Marc
Python3? 14.2.11 still supports python2; I can't imagine that a minor update makes such a change. Furthermore, wasn't el7 officially supported?

[ceph-users] Re: Ceph 14.2.17 ceph-mgr module issue

2021-03-12 Thread David Caro
That looks like a python version issue (running python2 when it should use python3). Are the container images you use available for the rest of the world to look into? If so, can you share the image? That would help debugging. If not, I suggest checking the python version in the containers. On 0
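A quick sanity check against the image itself, sketched under the assumption that python3 exists at the usual path inside ceph/daemon:latest-nautilus:

    docker run --rm --entrypoint python3 ceph/daemon:latest-nautilus --version
    docker run --rm --entrypoint python3 ceph/daemon:latest-nautilus -c "import typing; print('typing ok')"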

[ceph-users] Re: Question about delayed write IOs, octopus, mixed storage

2021-03-12 Thread Philip Brown
well that is a very interesting statistic. Where do you come up with the 30GB partition size limit number? I believe it is using 28GB SSD per HDD disk :-/ So you are implying that if I "throw away" 1/8 of my HDDs, so that I can get that magic number 30GB+ per HDD, things will magically be improve

[ceph-users] Re: Location of Crush Map and CEPH metadata

2021-03-12 Thread Nathan Fish
Every mon stores the crush map in /var/lib/ceph, I believe. On Fri, Mar 12, 2021 at 10:53 AM Ed Kalk wrote: > > Hello, > > I have been googling for the answer to this and not found it. > Does anyone know this? > > Where does CEPH store the crush map and the critical cluster metadata? > What preve
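To look at the crush map yourself (it lives inside each mon's store.db rather than as a plain file), a short sketch:

    ceph osd getcrushmap -o crush.bin    # fetch the binary crush map from the mons
    crushtool -d crush.bin -o crush.txt  # decompile it into readable text
    ceph mon dump                        # the monmap, i.e. which mons hold the cluster metadata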

[ceph-users] Location of Crush Map and CEPH metadata

2021-03-12 Thread Ed Kalk
Hello, I have been googling for the answer to this and not found it. Does anyone know this? Where does CEPH store the crush map and the critical cluster metadata? What prevents a loss of this metadata when a node is lost? -- Thank you for your time, Edward H. Kalk IV Information Technology

[ceph-users] Re: Container deployment - Ceph-volume activation

2021-03-12 Thread Kenneth Waegeman
Hi, The osd activate will probably be nice in the future, but for now I'm doing it like this: ceph-volume activate --all; for id in `ls -1 /var/lib/ceph/osd`; do echo cephadm adopt --style legacy --name ${id/ceph-/osd.}; done. It's not ideal because you still need the ceph rpms installed and s
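The same commands written out, as a sketch (the echo keeps the loop a dry run; drop it to actually adopt the daemons):

    ceph-volume activate --all
    for id in $(ls -1 /var/lib/ceph/osd); do
        echo cephadm adopt --style legacy --name "${id/ceph-/osd.}"
    done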

[ceph-users] Re: Unhealthy Cluster | Remove / Purge duplicate osds | Fix daemon

2021-03-12 Thread Sebastian Wagner
Hi Oliver, # ssh gedaopl02 # cephadm rm-daemon osd.0 should do the trick. Be careful to only remove the broken OSD :-) Best, Sebastian On 11.03.21 at 22:10, Oliver Weinmann wrote: > Hi, > > On my 3 node Octopus 15.2.5 test cluster, that I haven't used for quite > a while, I noticed that it shows
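Spelled out a little more defensively (the fsid is a placeholder; cephadm ls helps confirm you are removing the right daemon on that host):

    ssh gedaopl02
    cephadm ls | less                                     # daemons on this host, including their fsid
    cephadm rm-daemon --fsid <cluster-fsid> --name osd.0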

[ceph-users] Re: Question about delayed write IOs, octopus, mixed storage

2021-03-12 Thread Philip Brown
"First it is not a good idea to mix SSD/HDD OSDs in the same pool," Sorry for not being explicit. I used the cephadm/ceph orch facilities and told them "go set up all my disks". SO they automatically set up the SSDs to be WAL devices or whatever. ___

[ceph-users] Re: ceph bootstrap initialization :: nvme drives not empty after >12h

2021-03-12 Thread Eneko Lacunza
Hi Adrian, On 12/3/21 at 11:26, Adrian Sevcenco wrote: Hi! yesterday I bootstrapped (with cephadm) my first ceph installation and things looked somehow ok .. but today the osds are not yet ready and I have in dashboard these warnings: MDS_SLOW_METADATA_IO: 1 MDSs report slow metadata IOs

[ceph-users] Re: Ceph server

2021-03-12 Thread Ignazio Cassano
Yes, I noted that more bandwidth is required with this kind of server. I must reconsider my network infrastructure. Many thanks, Ignazio. On Fri, 12 Mar 2021 at 09:26, Robert Sander < r.san...@heinlein-support.de> wrote: > Hi, > > On 10.03.21 at 17:43, Ignazio Cassano wrote: > > >

[ceph-users] Re: ceph bootstrap initialization :: nvme drives not empty after >12h

2021-03-12 Thread Adrian Sevcenco
On 3/12/21 1:26 PM, Andrew Walker-Brown wrote: Hi Adrian, Hi! If you’re just using this for test/familiarity and performance isn’t an issue, then I’d create 3 x VMs on your host server and use them for Ceph. why? I kind of want to be close to a deployment scenario, which most certainly will n

[ceph-users] Re: ceph bootstrap initialization :: nvme drives not empty after >12h

2021-03-12 Thread Andrew Walker-Brown
Hi Adrian, If you’re just using this for test/familiarity and performance isn’t an issue, then I’d create 3 x VMs on your host server and use them for Ceph. It’ll work fine, just don’t expect Gb/s in transfer speeds 😊

[ceph-users] Re: ceph bootstrap initialization :: nvme drives not empty after >12h

2021-03-12 Thread Adrian Sevcenco
On 3/12/21 12:31 PM, Eneko Lacunza wrote: Hi Adrian, Hi! On 12/3/21 at 11:26, Adrian Sevcenco wrote: Hi! yesterday I bootstrapped (with cephadm) my first ceph installation and things looked somehow ok .. but today the osds are not yet ready and I have in dashboard these warnings: MDS_S

[ceph-users] ceph bootstrap initialization :: nvme drives not empty after >12h

2021-03-12 Thread Adrian Sevcenco
Hi! yesterday I bootstrapped (with cephadm) my first ceph installation and things looked somehow ok .. but today the osds are not yet ready and I have in dashboard these warnings: MDS_SLOW_METADATA_IO: 1 MDSs report slow metadata IOs PG_AVAILABILITY: Reduced data availability: 64 pgs inactive PG_
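Some first things to look at when cephadm has not turned the drives into OSDs yet, sketched for an Octopus/cephadm deployment:

    ceph -s                 # overall health, inactive PGs
    ceph orch device ls     # are the NVMe drives detected and reported as available?
    ceph orch ls osd        # was an OSD service spec applied at all?
    ceph orch ps | grep osd # which OSD daemons actually started
    ceph log last cephadm   # recent orchestrator/cephadm errors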

[ceph-users] Recover data from Cephfs snapshot

2021-03-12 Thread Jesper Lykkegaard Karlsen
Hi Ceph'ers, I love the possibility of making snapshots on Cephfs systems, although there is one thing that puzzles me. Creating a snapshot takes no time and deleting snapshots can bring PGs into the snaptrim state for some hours, while recovering data from a snapshot will always invoke a full da
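Recovering from a snapshot is indeed just a copy out of the hidden .snap directory, so it moves the full data; a sketch with a placeholder mount point and snapshot name:

    ls /mnt/cephfs/mydir/.snap/                                            # snapshots taken on this directory
    cp -a /mnt/cephfs/mydir/.snap/<snapname>/lostfile /mnt/cephfs/mydir/   # full data copy back into the live tree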

[ceph-users] Re: ERROR: S3 error: 403 (SignatureDoesNotMatch)

2021-03-12 Thread Szabo, Istvan (Agoda)
Fixed, need to remove this entry ... ehhh, spent 3 days on it. Istvan Szabo

[ceph-users] Re: ERROR: S3 error: 403 (SignatureDoesNotMatch)

2021-03-12 Thread Szabo, Istvan (Agoda)
Seems like the issue is this line in the radosgw configuration: rgw_dns_name = It only binds to the name which is listed there and ignores the cname totally, and haproxy ... Is there a way to have 2 rgw_dns_name entries? When I played around with putting in 2 names or 2 complete entries, it didn't work. Ist
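One way to serve several S3 endpoint names without relying on a single rgw_dns_name is the zonegroup hostnames list; a sketch with example hostnames (restart the rgw daemons afterwards):

    radosgw-admin zonegroup get > zonegroup.json
    # edit zonegroup.json:  "hostnames": ["s3.example.com", "s3-alt.example.com"]
    radosgw-admin zonegroup set --infile zonegroup.json
    radosgw-admin period update --commit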

[ceph-users] mds rank failed. loaded with preallocated inodes that are inconsistent with inotable

2021-03-12 Thread Ch Wan
Hi cephers. Recently we have been upgrading our ceph cluster from mimic to nautilus. We had 5 ranks and decreased max_mds from 5 down to 1 smoothly. When we set max_mds from 2 to 1, the cluster shows that rank 1 has failed; these are the mds logs: 2021-03-12 16:21:26.974 7f366e949700 1 mds.1.125077 handle_md

[ceph-users] Re: Question about delayed write IOs, octopus, mixed storage

2021-03-12 Thread Maged Mokhtar
On 12/03/2021 07:05, Philip Brown wrote: I'm running some tests with mixed storage units, and octopus. 8 nodes, each with 2 SSDs, and 8 HDDs. The SSDs are relatively small: around 100GB each. I'm mapping 8 rbds, striping them together, and running fio on them for testing. # fio --filename=/...
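For reference, a fio invocation of the kind described, sketched with placeholder device and parameters (the striped rbd device name depends on how the rbds were assembled):

    fio --name=randwrite --filename=/dev/md0 \
        --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
        --iodepth=32 --numjobs=4 --runtime=120 --time_based --group_reporting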

[ceph-users] Re: Best way to add OSDs - whole node or one by one?

2021-03-12 Thread Dave Hall
Reed, Thank you. This seems like a very well-thought-out approach. Your note about the balancer and the auto_scaler seems quite relevant as well. I'll give it a try when I add my next two nodes. -Dave -- Dave Hall Binghamton University On Thu, Mar 11, 2021 at 5:53 PM Reed Dier wrote: > I'm sure

[ceph-users] Re: Best way to add OSDs - whole node or one by one?

2021-03-12 Thread Andrew Walker-Brown
Dave, Worth just looking at utilisation across your OSDs. I’ve had PGs get stuck in backfill_wait / backfill_toofull when I’ve added new OSDs. Ceph was unable to move a PG onto a smaller-capacity OSD that was quite full. I had to increase the number of PGs (pg_num) for it to get sorted (and do
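Commands for the checks described above, sketched with a placeholder pool name:

    ceph osd df tree                              # per-OSD utilisation; spot nearly full or undersized OSDs
    ceph pg dump pgs_brief | grep -i backfill     # PGs stuck in backfill_wait / backfill_toofull
    ceph osd pool set <pool> pg_num <new_pg_num>  # split PGs so they fit on the smaller OSDs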

[ceph-users] Re: Ceph server

2021-03-12 Thread Robert Sander
On 10.03.21 at 20:44, Ignazio Cassano wrote: > 1 small ssd is for the operating system and 1 is for mon. Make that a RAID1 set of SSDs and be happier. ;) Regards -- Robert Sander Heinlein Support GmbH Schwedter Str. 8/9b, 10119 Berlin http://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030

[ceph-users] Re: Ceph server

2021-03-12 Thread Robert Sander
Hi, On 10.03.21 at 17:43, Ignazio Cassano wrote: > 5 x 8.0TB Intel® SSD DC P4510 Series U.2 PCIe 3.1 x4 NVMe Solid State Drive > Hard Drive > 2 x Intel® 10-Gigabit Ethernet Converged Network Adapter X710-DA2 (2x SFP+) Have you calculated the throughput of 8 NVMe drives against 2x 10G bonded in