Hi All,
We have a CephFS cluster running Octopus with three control nodes each running
an MDS, Monitor, and Manager on Ubuntu 20.04. The OS drive on one of these
nodes failed recently and we had to do a fresh install, but made the mistake of
installing Ubuntu 22.04, where Octopus is not available.
On 2/4/21 7:17 PM, dhils...@performair.com wrote:
Why would I, when I can get an 18TB Seagate IronWolf for <$600, an 18TB Seagate Exos for <$500, or an 18TB WD Gold for <$600?
IOPS
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io
In the end, this is nothing but probability.
Picture this, using size=3, min_size=2:
- One node is down for maintenance
- You lose a couple of devices
- You lose data
Is it likely that an NVMe device dies during a short maintenance window?
Is it likely that two devices die at the same time?
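A back-of-the-envelope version of that probability argument; the 0.5% annual failure rate and the 4-hour window are assumptions for illustration, not numbers from this thread:

```shell
# Rough odds that two specific devices both die during one maintenance window
awk 'BEGIN {
  afr   = 0.005                      # assumed annual failure rate per NVMe device
  hours = 4                          # assumed maintenance window length
  p     = afr * hours / (365 * 24)   # one given device failing in the window
  printf "p(one device in window): %.2e\n", p
  printf "p(two given devices):    %.2e\n", p * p
}'
```

For two specific devices the double-failure probability is tiny even with pessimistic inputs; the real risk comes from the number of device pairs in a large cluster and from correlated failures, which this naive model ignores.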
From: Adam Boyhan
Sent: 05 February 2021 13:58:34
To: Frank Schilder
Cc: Jack; ceph-users
Subject: Re: [ceph-users] Re: NVMe and 2x Replica
This turned into a great thread. Lots of good information and clarification.
I am 100% on board with 3 copies for the primary.
1 activating+undersized+degraded+remapped
1 active+recovering+laggy
On 4/8/20 3:27 PM, Jack wrote:
> The CPU is used by userspace, not kernelspace
>
> Here is the perf top, see attachment
>
> Rocksdb eats everything :/
>
>
> On 4/8/20 3:14 PM, Paul Emmerich wrote:
Hi,
You will get slow performance: EC is slow, and HDDs are slow too.
With 400 IOPS per device, you get 89600 IOPS for the whole cluster, raw.
With 8+3 EC, each logical write is mapped to 11 physical writes.
You get only 8145 write IOPS (is my math correct?), which I find very
low for a PB of storage.
So, u
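The arithmetic above can be checked mechanically; the device count of 224 is inferred from 89600 / 400 and is an assumption about the cluster being discussed:

```shell
# Cluster-wide raw IOPS, then effective logical write IOPS under 8+3 EC
devices=224
iops_per_device=400
raw=$((devices * iops_per_device))   # 89600 raw write IOPS
ec_writes=11                         # 8+3 EC: 11 physical writes per logical write
echo "raw=$raw logical=$((raw / ec_writes))"
```

This reproduces the 8145 figure from the message (integer division; the exact value is 8145.45).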
> Cost of SSD vs. HDD is still 6:1 in favor of HDDs.
It is not: you pay less per terabyte with HDDs, that is true.
$/TB is better for spinning than for flash, but this is not the most
important indicator, and by far: $/IOPS is another story indeed.
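To make the $/IOPS point concrete: the HDD price below is in line with the 18TB drives quoted earlier in the thread, while the NVMe price and both IOPS figures are assumptions for illustration:

```shell
# Price per IOPS: spinning vs flash (illustrative numbers)
awk 'BEGIN {
  hdd_price = 550;  hdd_iops = 400        # ~18 TB HDD, random write IOPS
  ssd_price = 2000; ssd_iops = 200000     # assumed enterprise NVMe
  printf "HDD: $%.3f per IOPS\n", hdd_price / hdd_iops
  printf "SSD: $%.3f per IOPS\n", ssd_price / ssd_iops
}'
```

The ratio flips by two orders of magnitude, which is what "$/IOPS is another story" means here.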
On 12/3/19 9:46 PM, jes...@kr
You can snapshot, but you cannot export a diff of snapshots
On 12/4/19 9:19 AM, Konstantin Shalygin wrote:
> On 12/4/19 3:06 AM, Fabien Sirjean wrote:
>> * ZFS on RBD, exposed via samba shares (cluster with failover)
>
> Why not use samba vfs_ceph instead? It's scalable direct access.
>
>> *
Hi,
You can use a full local export, piped to some hash program (this is
what Backurne¹ does): rbd export - | xxhsum
Then, check the hash consistency with the original.
Regards,
[1] https://github.com/JackSlateur/backurne
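Spelled out as a comparison, this looks like the sketch below; "pool/image", the snapshot name, and "backup-pool" are placeholders:

```shell
# Hash the source copy and the backup copy, then compare the digests
src=$(rbd export pool/image@snap - | xxhsum | awk '{print $1}')
dst=$(rbd export backup-pool/image@snap - | xxhsum | awk '{print $1}')
[ "$src" = "$dst" ] && echo "copies match" || echo "MISMATCH"
```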
On 3/3/20 8:46 PM, Stefan Priebe - Profihost AG wrote:
> Hello,
>
> doe
Hi,
Are you exporting an rbd image while a VM is running on it?
As far as I know, rbd export is not consistent.
You should not export an image, but only snapshots:
- create a snapshot of the image
- export the snapshot (rbd export pool/image@snap - | ..)
- drop the snapshot
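The three steps above as commands (a sketch; "pool/image" and the snapshot name are placeholders):

```shell
rbd snap create pool/image@export-tmp        # 1. create a snapshot of the image
rbd export pool/image@export-tmp - | xxhsum  # 2. export the snapshot, not the live image
rbd snap rm pool/image@export-tmp            # 3. drop the snapshot
```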
Regards,
On 3/10/20
Hi,
As the upgrade documentation says:
> Note that the first time each OSD starts, it will do a format
> conversion to improve the accounting for “omap” data. This may
> take a few minutes to as much as a few hours (for an HDD with lots
> of omap data). You can disable this automatic conversion with the
> bluestore_fsck_quick_fix_on_mount option.
It is not a joke :)
First node is upgraded (and converted), my cluster is currently healing
its degraded objects
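For reference, the option behind the quoted note is `bluestore_fsck_quick_fix_on_mount` (per the Octopus release notes); setting it to false before restarting the OSDs defers the conversion:

```shell
# Defer the omap format conversion instead of doing it on first start
ceph config set osd bluestore_fsck_quick_fix_on_mount false
```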
On 4/1/20 5:37 PM, Dan van der Ster wrote:
> Doh, I hope so!
>
> On Wed, Apr 1, 2020 at 5:35 PM Marc Roos wrote:
>>
>> April fools day!! :)
>>
>>
>> -Original Message-
>
Hi,
A simple fsck eats the same amount of memory
Cluster usage: rbd with a bit of rgw
Here is the ceph df detail
All OSDs are single rusty devices
On 4/2/20 2:19 PM, Igor Fedotov wrote:
> Hi Jack,
>
> could you please try the following - stop one of already converted OSDs
> and do
(fsck / quick-fix, same story)
On 4/2/20 3:12 PM, Jack wrote:
> Hi,
>
> A simple fsck eats the same amount of memory
>
> Cluster usage: rbd with a bit of rgw
>
> Here is the ceph df detail
> All OSDs are single rusty devices
>
> On 4/2/20 2:19 PM, Igor Fedotov
Correct
On 4/2/20 3:17 PM, Igor Fedotov wrote:
> And high memory usage is present for quick-fix after conversion as well,
> isn't it?
>
> The same tens of GBs?
>
>
> On 4/2/2020 4:13 PM, Jack wrote:
>> (fsck / quick-fix, same story)
>>
>> On 4
Here it is
On 4/2/20 3:48 PM, Igor Fedotov wrote:
> And may I have the output for:
>
> ceph daemon osd.N calc_objectstore_db_histogram
>
> This will collect some stats on record types in OSD's DB.
>
>
> On 4/2/2020 4:13 PM, Jack wrote:
>> (fsck / quick-fix,
> POOL  ID  STORED   (DATA)   (OMAP)   OBJECTS  USED     (DATA)   (OMAP)   %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY   USED COMPR  UNDER COMPR
> rbd   1   245 TiB  245 TiB  9.0 MiB  50.26M   151 TiB  151 TiB  9.0 MiB  90.03  12 TiB     N/A            N/A          50.26M  35 TiB      144 TiB
>
> Stored - 245 TiB, Used - 151 TiB
>
> Can't imagine any explanation other
wrote:
> Thanks, Jack.
>
> One more question please - what's the actual maximum memory consumption
> for this specific OSD during fsck?
>
> And is it backed by a 3, 6 or 10 TB drive?
>
>
> Regards,
>
> Igor
>
> On 4/2/2020 7:15 PM, Jack wrote:
>> I
Hi,
Check out rbd status.
For instance:
root@ceph5-1:~# rbd status vm-903-disk-1
Watchers:
watcher=10.5.0.39:0/866486904 client.522682726 cookie=140177351959424
This is the list of clients for that image
All mapping hosts are in it
On 4/7/20 6:46 PM, Void Star Nill wrote:
> Hello,
>
> I
Hello,
I have an issue since my Nautilus -> Octopus upgrade.
My cluster has many rbd images (~3k or so).
Each of them has ~30 snapshots.
Each day, I create and remove at least one snapshot per image.
Since Octopus, when I remove the "nosnaptrim" flag, each OSD uses 100%
of its CPU time.
The whole
I put the nosnaptrim flag during the upgrade because I saw high CPU usage and
thought it was somehow related to the upgrade process.
However, all my daemons are now running Octopus, and the issue is still
here, so I was wrong.
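For reference, the flag is toggled cluster-wide with:

```shell
ceph osd set nosnaptrim     # pause snapshot trimming on all OSDs
ceph osd unset nosnaptrim   # resume trimming (when the CPU spike shows up)
```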
On 4/8/20 1:58 PM, Wido den Hollander wrote:
>
>
> On 4/8/20 1:38 PM, J
eep ?
>
> On Wed, Apr 8, 2020 at 2:03 PM Jack wrote:
>>
>> I put the nosnaptrim flag during the upgrade because I saw high CPU usage and
>> thought it was somehow related to the upgrade process
>> However, all my daemons are now running Octopus, and the issue is still
>> he
The CPU is used by userspace, not kernelspace
Here is the perf top, see attachment
Rocksdb eats everything :/
On 4/8/20 3:14 PM, Paul Emmerich wrote:
> What's the CPU busy with while spinning at 100%?
>
> Check "perf top" for a quick overview
>
>
> Paul
>
map usage stats"
>
> On Thu, 09 Apr 2020 02:15:02 +0800 Jack <c...@jack.fr.eu.org> wrote:
>
> Just to confirm this does not get better:
>
> root@backup1:~# ceph status
> cluster:
> id: 9cd41f0
Hi,
On 4/22/20 11:47 PM, cody.schm...@iss-integration.com wrote:
> Example 1:
> 8x 60-Bay (8TB) Storage nodes (480x 8TB SAS Drives)
> Storage Node Spec:
> 2x 32C 2.9GHz AMD EPYC
>- Documentation mentions .5 cores per OSD for throughput optimized. Are
> they talking about .5 Physical cores
We did have a peering storm, we're past that portion of the backfill and still
experiencing new instances of rbd volumes hanging. It is for sure not just the
peering storm.
We still have 22.184% of objects misplaced, with a bunch of pgs left to backfill
(like 75k). Our rbd pool is using about 1.7P