[ceph-users] MDS recovery

2023-04-25 Thread jack
Hi All, We have a CephFS cluster running Octopus with three control nodes each running an MDS, Monitor, and Manager on Ubuntu 20.04. The OS drive on one of these nodes failed recently and we had to do a fresh install, but made the mistake of installing Ubuntu 22.04 where Octopus is not availabl

[ceph-users] Re: NVMe and 2x Replica

2021-02-04 Thread Jack
On 2/4/21 7:17 PM, dhils...@performair.com wrote: Why would I when I can get an 18TB Seagate IronWolf for <$600, an 18TB Seagate Exos for <$500, or an 18TB WD Gold for <$600? IOPS

[ceph-users] Re: NVMe and 2x Replica

2021-02-05 Thread Jack
At the end, this is nothing but probability. Picture this, using size=3, min_size=2: - One node is down for maintenance - You lose a couple of devices - You lose data Is it likely that an NVMe device dies during a short maintenance window? Is it likely that two devices die at the same
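
For reference, a minimal sketch of the replica settings under discussion, using a hypothetical pool name:

    # three copies; refuse I/O once fewer than two copies are available
    ceph osd pool set mypool size 3
    ceph osd pool set mypool min_size 2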

[ceph-users] Re: NVMe and 2x Replica

2021-02-05 Thread Jack
From: Adam Boyhan Sent: 05 February 2021 13:58:34 To: Frank Schilder Cc: Jack; ceph-users Subject: Re: [ceph-users] Re: NVMe and 2x Replica This turned into a great thread. Lots of good information and clarification. I am 100% on board with 3 copies for the primary.

[ceph-users] Re: Building a petabyte cluster from scratch

2020-05-29 Thread Jack
On 12/4/19 9:19 AM, Konstantin Shalygin wrote: > CephFS indeed supports snapshots. Since Samba 4.11 this feature is supported > too, via vfs_ceph_snapshots. You can snapshot, but you cannot export a diff of snapshots

[ceph-users] Re: [Octopus] OSD overloading

2020-06-10 Thread Jack
1 activating+undersized+degraded+remapped 1 active+recovering+laggy On 4/8/20 3:27 PM, Jack wrote: > The CPU is used by userspace, not kernelspace > > Here is the perf top, see attachment > > Rocksdb eats everything :/ > > > On 4/8/20 3:14 PM, Paul Emme

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-03 Thread Jack
Hi, You will get slow performance: EC is slow, and HDDs are slow too. With 400 IOPS per device, you get 89,600 IOPS for the whole cluster, raw. With 8+3 EC, each logical write is mapped to 11 physical writes. You get only 8,145 write IOPS (is my math correct?), which I find very low for a PB of storage. So, u
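
A quick check of the arithmetic above, assuming the 89,600 raw IOPS figure implies roughly 224 HDDs at 400 IOPS each:

    224 drives x 400 IOPS        = 89,600 raw write IOPS
    89,600 / 11 physical writes  ≈ 8,145 logical write IOPS with 8+3 EC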

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-03 Thread Jack
> Cost of SSD vs. HDD is still in the 6:1 favor of HDDs. It is not: you can buy more for less money with HDDs, that is true. $/TB is better for spinning disks than for flash, but this is not the most important indicator, and by far: $/IOPS is another story indeed On 12/3/19 9:46 PM, jes...@kr

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-04 Thread Jack
You can snapshot, but you cannot export a diff of snapshots On 12/4/19 9:19 AM, Konstantin Shalygin wrote: > On 12/4/19 3:06 AM, Fabien Sirjean wrote: >>   * ZFS on RBD, exposed via samba shares (cluster with failover) > > Why not use samba vfs_ceph instead? It's scalable direct access. > >>   *

[ceph-users] Re: consistency of import-diff

2020-03-03 Thread Jack
Hi, You can use a full local export, piped to some hash program (this is what Backurne¹ does): rbd export - | xxhsum. Then, check the hash consistency with the original. Regards, [1] https://github.com/JackSlateur/backurne On 3/3/20 8:46 PM, Stefan Priebe - Profihost AG wrote: > Hello, > > doe
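
A minimal sketch of the check described above; pool, image, and snapshot names are illustrative, and xxhsum must be installed on both sides:

    # on the source cluster
    rbd export rbd/vm-disk@backup-snap - | xxhsum
    # on the destination cluster, after the import / import-diff chain
    rbd export backup/vm-disk@backup-snap - | xxhsum
    # the two digests should match if the diffs were applied consistently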

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-10 Thread Jack
Hi, Are you exporting the rbd image while a VM is running on it? As far as I know, rbd export is not consistent. You should not export an image, but only snapshots: - create a snapshot of the image - export the snapshot (rbd export pool/image@snap - | ..) - drop the snapshot Regards, On 3/10/20
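
A minimal sketch of the snapshot-then-export workflow described above, with illustrative pool and image names:

    rbd snap create rbd/vm-disk@export                        # freeze a consistent point in time
    rbd export rbd/vm-disk@export - | gzip > vm-disk.img.gz   # pipe the snapshot to any consumer
    rbd snap rm rbd/vm-disk@export                            # drop the snapshot once exported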

[ceph-users] [Octopus] Beware the on-disk conversion

2020-04-01 Thread Jack
Hi, As the upgrade documentation says: > Note that the first time each OSD starts, it will do a format > conversion to improve the accounting for “omap” data. This may > take a few minutes to as much as a few hours (for an HDD with lots > of omap data). You can disable this automatic conversion w
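
The truncated sentence most likely refers to the quick-fix-on-mount option from the Octopus release notes; a sketch of disabling it before restarting OSDs (the OSD path below is illustrative):

    ceph config set osd bluestore_fsck_quick_fix_on_mount false
    # the conversion can then be done later, per OSD, while it is stopped:
    # ceph-bluestore-tool quick-fix --path /var/lib/ceph/osd/ceph-12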

[ceph-users] Re: [Octopus] Beware the on-disk conversion

2020-04-01 Thread Jack
It is not a joke :) First node is upgraded (and converted), my cluster is currently healing its degraded objects On 4/1/20 5:37 PM, Dan van der Ster wrote: > Doh, I hope so! > > On Wed, Apr 1, 2020 at 5:35 PM Marc Roos wrote: >> >> April fools day!! :) >> >> >> -Original Message- >

[ceph-users] Re: [Octopus] Beware the on-disk conversion

2020-04-02 Thread Jack
Hi, A simple fsck eats the same amount of memory. Cluster usage: rbd with a bit of rgw. Here is the ceph df detail. All OSDs are single rusty devices. On 4/2/20 2:19 PM, Igor Fedotov wrote: > Hi Jack, > > could you please try the following - stop one of the already converted OSDs > and do
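
For context, a sketch of running such an off-line fsck against a stopped OSD (the OSD id is illustrative):

    systemctl stop ceph-osd@12
    ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12
    systemctl start ceph-osd@12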

[ceph-users] Re: [Octopus] Beware the on-disk conversion

2020-04-02 Thread Jack
(fsck / quick-fix, same story) On 4/2/20 3:12 PM, Jack wrote: > Hi, > > A simple fsck eats the same amount of memory > > Cluster usage: rbd with a bit of rgw > > Here is the ceph df detail > All OSDs are single rusty devices > > On 4/2/20 2:19 PM, Igor Fedotov

[ceph-users] Re: [Octopus] Beware the on-disk conversion

2020-04-02 Thread Jack
Correct $ On 4/2/20 3:17 PM, Igor Fedotov wrote: > And high memory usage is present for quick-fix after conversion as well, > isn't it? > > The same tens of GBs? > > > On 4/2/2020 4:13 PM, Jack wrote: >> (fsck / quick-fix, same story) >> >> On 4

[ceph-users] Re: [Octopus] Beware the on-disk conversion

2020-04-02 Thread Jack
Here it is On 4/2/20 3:48 PM, Igor Fedotov wrote: > And may I have the output for: > > ceph daemon osd.N calc_objectstore_db_histogram > > This will collect some stats on record types in OSD's DB. > > > On 4/2/2020 4:13 PM, Jack wrote: >> (fsck / quick-fix,

[ceph-users] Re: [Octopus] Beware the on-disk conversion

2020-04-02 Thread Jack
(quoted ceph df detail fragment for the rbd pool: 50.26M objects, STORED 245 TiB, USED 151 TiB, %USED 90.03, MAX AVAIL 12 TiB, USED COMPR 35 TiB, UNDER COMPR 144 TiB) > Stored - 245 TiB, Used - 151 TiB > > Can't imagine any explanation other

[ceph-users] Re: [Octopus] Beware the on-disk conversion

2020-04-03 Thread Jack
wrote: > Thanks, Jack. > > One more question please - what's the actual maximum memory consumption > for this specific OSD during fsck? > > And is it backed by a 3, 6, or 10 TB drive? > > > Regards, > > Igor > > On 4/2/2020 7:15 PM, Jack wrote: >> I

[ceph-users] Re: Fwd: Question on rbd maps

2020-04-07 Thread Jack
Hi, Check out rbd status. For instance: root@ceph5-1:~# rbd status vm-903-disk-1 Watchers: watcher=10.5.0.39:0/866486904 client.522682726 cookie=140177351959424 This is the list of clients for that image; all mapping hosts are in it On 4/7/20 6:46 PM, Void Star Nill wrote: > Hello, > > I
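
Building on that, a hedged sketch that walks a whole pool and prints the watchers of every image, to see which hosts map what (the pool name is illustrative):

    for img in $(rbd ls rbd); do
        echo "== ${img} =="
        rbd status "rbd/${img}"
    done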

[ceph-users] [Octopus] OSD overloading

2020-04-08 Thread Jack
Hello, I've an issue since my Nautilus -> Octopus upgrade. My cluster has many rbd images (~3k or something), each of them with ~30 snapshots. Each day, I create and remove at least one snapshot per image. Since Octopus, when I remove the "nosnaptrim" flag, each OSD uses 100% of its CPU time The whole
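
For reference, a sketch of the flag being discussed, plus the usual way to throttle trimming rather than pause it outright (the values are illustrative):

    ceph osd set nosnaptrim        # pause snapshot trimming cluster-wide
    ceph osd unset nosnaptrim      # resume it
    # throttle instead of pausing:
    ceph config set osd osd_snap_trim_sleep 2
    ceph config set osd osd_pg_max_concurrent_snap_trims 1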

[ceph-users] Re: [Octopus] OSD overloading

2020-04-08 Thread Jack
I set the nosnaptrim flag during the upgrade because I saw high CPU usage and thought it was somehow related to the upgrade process. However, all my daemons are now running Octopus, and the issue is still here, so I was wrong On 4/8/20 1:58 PM, Wido den Hollander wrote: > > > On 4/8/20 1:38 PM, J

[ceph-users] Re: [Octopus] OSD overloading

2020-04-08 Thread Jack
eep ? > > On Wed, Apr 8, 2020 at 2:03 PM Jack wrote: >> >> I set the nosnaptrim flag during the upgrade because I saw high CPU usage and >> thought it was somehow related to the upgrade process >> However, all my daemons are now running Octopus, and the issue is still >> he

[ceph-users] Re: [Octopus] OSD overloading

2020-04-08 Thread Jack
The CPU is used by userspace, not kernelspace. Here is the perf top output, see attachment. RocksDB eats everything :/ On 4/8/20 3:14 PM, Paul Emmerich wrote: > What's the CPU busy with while spinning at 100%? > > Check "perf top" for a quick overview > > > Paul > Samples: 1M of event 'cycles:ppp',
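
A sketch of the kind of check being quoted, narrowed to a single OSD process (the OSD id and pgrep pattern are illustrative):

    pid=$(pgrep -f "ceph-osd .*--id 12" | head -n1)
    perf top -p "${pid}"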

[ceph-users] Re: [Octopus] OSD overloading

2020-04-08 Thread Jack
1 activating+undersized+degraded+remapped 1 active+recovering+laggy On 4/8/20 3:27 PM, Jack wrote: > The CPU is used by userspace, not kernelspace > > Here is the perf top, see attachment > > Rocksdb eats everything :/ > > > On 4/8/20 3:14 PM, Paul Emmerich

[ceph-users] Re: [Octopus] OSD overloading

2020-04-12 Thread Jack
map usage stats" > > > > > > On Thu, 09 Apr 2020 02:15:02 +0800 Jack <mailto:c...@jack.fr.eu.org> > wrote > > > > Just to confirm this does not get better: > > root@backup1:~# ceph status > cluster: > id: 9cd41f0

[ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?

2020-04-22 Thread Jack
Hi, On 4/22/20 11:47 PM, cody.schm...@iss-integration.com wrote: > Example 1: > 8x 60-Bay (8TB) Storage nodes (480x 8TB SAS Drives) > Storage Node Spec: > 2x 32C 2.9GHz AMD EPYC >- Documentation mentions .5 cores per OSD for throughput optimized. Are > they talking about .5 Physical cores
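
A back-of-envelope check of the core budget implied by Example 1, assuming the 0.5-cores-per-OSD guideline means physical cores:

    60 OSDs per node x 0.5 cores   = 30 physical cores needed per node
    2 x 32C AMD EPYC               = 64 physical cores available per node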

[ceph-users] Re: librbd hangs during large backfill

2023-07-20 Thread Jack Hayhurst
We did have a peering storm; we're past that portion of the backfill and still experiencing new instances of rbd volumes hanging. It is for sure not just the peering storm. We still have 22.184% of objects misplaced, with a bunch of PGs left to backfill (like 75k). Our rbd pool is using about 1.7P