[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Matthias Ferdinand
On Tue, May 21, 2024 at 08:54:26PM +, Eugen Block wrote: > It’s usually no problem to shut down a cluster. Set at least the noout flag, > the other flags like norebalance, nobackfill etc won’t hurt either. Then > shut down the servers. I do that all the time with test clusters (they do > have d

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Eugen Block
It’s usually no problem to shut down a cluster. Set at least the noout flag; the other flags like norebalance, nobackfill etc. won’t hurt either. Then shut down the servers. I do that all the time with test clusters (they do have data, just not important at all), and I’ve never had data loss
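For reference, a minimal sketch of the flag handling around such a planned shutdown (the exact set of extra flags is a judgment call):

    ceph osd set noout
    ceph osd set norebalance
    ceph osd set nobackfill
    # ... power down OSD hosts, then MONs/MGRs last; boot in reverse order ...
    ceph osd unset nobackfill
    ceph osd unset norebalance
    ceph osd unset noout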

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread adam.ther
Thanks guys, I think I'll just risk it because it's just for backup, then write something up later as a follow-up on what happens, in case others want to do similar. I agree it's not typical, I'm a bit of an odd-duck data hoarder. Regards, Adam On 5/21/24 14:21, Matt Vandermeulen wrote: I would norm

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Matt Vandermeulen
I would normally vouch for ZFS for this sort of thing, but the mix of drive sizes will be... an inconvenience, at best. You could get creative with the hierarchy (making raidz{2,3} of mirrors of same-sized drives, or something), but it would be far from ideal. I use ZFS for my own home machine
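One workable variant of that idea is simply striping mirror vdevs of same-sized pairs; a hypothetical sketch (pool and device names are placeholders):

    zpool create backup mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd
    zpool list -v backup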

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Marc
> It's all non-corporate data, I'm just trying to cut back on wattage > (removes around 450W of the 2.4 kW) by powering down backup servers that 450W for one server seems quite hefty. Under full load? You can also check your CPU power states and frequency; that also cuts some power. > > So that
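A sketch of the kind of CPU power tuning being suggested (requires the cpupower utility from linux-tools):

    cpupower frequency-info               # current governor and frequency range
    cpupower idle-info                    # which C-states are enabled
    cpupower frequency-set -g powersave   # switch all cores to the powersave governor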

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread adam.ther
Hello, It's all non-corporate data, I'm just trying to cut back on wattage (removes around 450W of the 2.4 kW) by powering down backup servers that house 208TB while not being backed up or restoring. ZFS sounds interesting, however does it play nice with a mix of drive sizes? That's primarily

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Marc
> > I think it is his lab so maybe it is a test setup for production. > > Home production? A home setup to test on, before he applies changes to his production Saluti 🍷 ;)

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Anthony D'Atri
> I think it is his lab so maybe it is a test setup for production. Home production? > > I don't think it matters too much with scrubbing, it is not like it is related > to how long you were offline. It will scrub just as much being 1 month > offline as being 6 months offline. > >> >> If y

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Marc
I think it is his lab so maybe it is a test setup for production. I don't think it matters too much with scrubbing; it is not like it is related to how long you were offline. It will scrub just as much being 1 month offline as being 6 months offline. > > If you have a single node arguably ZFS wo

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Anthony D'Atri
If you have a single node, arguably ZFS would be a better choice. > On May 21, 2024, at 14:53, adam.ther wrote: > > Hello, > > To save on power in my home lab, can I have a single-node Ceph cluster sit > idle and powered off for 3 months at a time, then boot only to refresh > backups? Or will th

[ceph-users] CephFS as Offline Storage

2024-05-21 Thread adam.ther
Hello, To save on power in my home lab, can I have a single-node Ceph cluster sit idle and powered off for 3 months at a time, then boot only to refresh backups? Or will this cause issues I'm unaware of? I'm aware deep-scrubbing will not happen; it would be done when in the boot-up period ever

[ceph-users] Re: rbd-mirror failed to query services: (13) Permission denied

2024-05-21 Thread Stefan Kooman
Hi, On 29-04-2024 17:15, Ilya Dryomov wrote: On Tue, Apr 23, 2024 at 8:28 PM Stefan Kooman wrote: On 23-04-2024 17:44, Ilya Dryomov wrote: On Mon, Apr 22, 2024 at 7:45 PM Stefan Kooman wrote: Hi, We are testing rbd-mirroring. There seems to be a permission error with the rbd-mirror user.
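For context, the documented baseline caps for an rbd-mirror daemon user look like this (the client name is an example; whether missing caps are actually behind the "failed to query services" error in this thread is not clear from the excerpt):

    ceph auth get-or-create client.rbd-mirror.site-a \
        mon 'profile rbd-mirror' osd 'profile rbd'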

[ceph-users] Re: Cephfs over internet

2024-05-21 Thread adam.ther
Hello, You will want to do this over WireGuard; from experience, IOPS will be brutal, like 200 IOPS. WireGuard has a few benefits, notably: - Higher rate of transfer per CPU load. - State-of-the-art protocols, as opposed to some of the more legacy systems. - Extremely
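A minimal sketch of the WireGuard setup being referred to (keys, addresses and interface names are placeholders):

    wg genkey | tee privatekey | wg pubkey > publickey
    # /etc/wireguard/wg0.conf on one endpoint:
    #   [Interface]
    #   Address    = 10.10.0.1/24
    #   PrivateKey = <contents of privatekey>
    #   ListenPort = 51820
    #   [Peer]
    #   PublicKey  = <peer's public key>
    #   AllowedIPs = 10.10.0.2/32
    wg-quick up wg0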

[ceph-users] Re: dkim on this mailing list

2024-05-21 Thread Frank Schilder
Hi Marc, in case you are working on the list server, at least for me the situation seems to have improved no more than 2-3 hours ago. My own e-mails to the list now pass. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] dkim on this mailing list

2024-05-21 Thread Marc
Just to confirm whether I am messing up my mailserver configs: currently all messages from this mailing list should generate a DKIM pass status?
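A quick way to check the verdict your own MTA recorded for a list message (assuming it adds an Authentication-Results header; opendkim-testmsg from opendkim-tools is an alternative):

    grep -i '^Authentication-Results:' message.eml
    opendkim-testmsg < message.eml   # silent if the signature verifies (behaviour may vary by version)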

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread Frank Schilder
> Not with the most recent Ceph releases. Actually, this depends. If it's SSDs for which IOPS profit from higher iodepth, it is very likely to improve performance, because until today each OSD has only one kv_sync_thread and this is typically the bottleneck with heavy IOPS load. Having 2-4 kv_sy
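For reference, a sketch of deploying several OSDs on one NVMe device with ceph-volume (the device path is a placeholder; cephadm users would express the same thing via an OSD service spec):

    ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1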

[ceph-users] Re: Cephfs over internet

2024-05-21 Thread Burkhard Linke
Hi, On 5/21/24 13:39, Marcus wrote: Thanks for your answers! I read somewhere that a vpn would really have an impact on performance, so it was not recommended, and I found v2 protocol. But vpn feels like the solution and you have to accept the lower speed. Also keep in mind that clients hav

[ceph-users] Re: cephadm bootstraps cluster with bad CRUSH map(?)

2024-05-21 Thread Matthew Vernon
Hi, Returning to this, it looks like the issue wasn't to do with how osd_crush_chooseleaf_type is set; I destroyed and re-created my cluster as before, and I have the same problem again: pg 1.0 is stuck inactive for 10m, current state unknown, last acting [] as before, ceph osd tree: root@mos
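A few generic checks that help narrow down a PG stuck in 'unknown' with an empty acting set (not specific to this report):

    ceph osd tree
    ceph osd crush rule dump
    ceph pg dump_stuck inactive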

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-21 Thread Eugen Block
Thanks, Konstantin. It's been a while since I was last bitten by the choose_tries being too low... Unfortunately, I won't be able to verify that... But I'll definitely keep that in mind, or at least I'll try to. :-D Thanks! Zitat von Konstantin Shalygin : Hi Eugen On 21 May 2024, at 15:26,

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-21 Thread Konstantin Shalygin
Hi Eugen > On 21 May 2024, at 15:26, Eugen Block wrote: > > step set_choose_tries 100 I think you should try to increase set_choose_tries to 200. Last year we had a Pacific EC 8+2 deployment of 10 racks, and even with 50 hosts the value of 100 did not work for us k
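A sketch of how such a CRUSH rule change is typically applied offline (file names are placeholders):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt: inside the EC rule, set 'step set_choose_tries 200'
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin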

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread Anthony D'Atri
> > > I have additional questions, > We use 13 disks (3.2TB NVMe) per server and allocate one OSD to each disk. In > other words 1 node has 13 OSDs. > Do you think this is inefficient? > Is it better to create more OSDs by creating LVs on the disk? Not with the most recent Ceph releases. I suspec

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread 서민우
I compared your advice to my current setup. I think increasing 'osd_memory_target' is what worked. I changed 4G -> 8G and the latency of our test decreased by about 50%. I have additional questions, We use 13 disks (3.2TB NVMe) per server and allocate one OSD to each disk. In other words 1 node has
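The change described above, expressed via the cluster config database (the value is 8 GiB in bytes):

    ceph config set osd osd_memory_target 8589934592
    ceph config get osd.0 osd_memory_target   # verify on one daemon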

[ceph-users] Re: Cephfs over internet

2024-05-21 Thread Paul Mezzanini
We did a proof of concept moving some compute into "the cloud" and exported our cephfs shares using wireguard as the tunnel. The performance impact on our storage was completely latency and bandwidth dependent with no noticeable impact from the tunnel itself. -paul -- Paul Mezzanini Platf

[ceph-users] Re: Ceph osd df tree takes a long time to respond

2024-05-21 Thread Eugen Block
First thing to try would be to fail the mgr. Although the daemons might be active from a systemd perspective, they sometimes get unresponsive. I saw that in Nautilus clusters as well, so that might be worth a try. Zitat von Huy Nguyen : Ceph version 14.2.7 Ceph osd df tree command take lo
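A sketch of the mgr failover being suggested (the daemon name is a placeholder; check 'ceph -s' for the active one):

    ceph -s | grep mgr
    ceph mgr fail <active-mgr-name>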

[ceph-users] unknown PGs after adding hosts in different subtree

2024-05-21 Thread Eugen Block
Hi, I got into a weird and unexpected situation today. I added 6 hosts to an existing Pacific cluster (16.2.13, 20 existing OSD hosts across 2 DCs). The hosts were added to the root=default subtree, their designated location is one of two datacenters underneath the default root. Nothing u
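For context, moving a newly added host into its intended datacenter bucket usually looks like this (host and bucket names are placeholders):

    ceph osd crush move host7 datacenter=dc1
    ceph osd tree   # verify the placement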

[ceph-users] Re: Cephfs over internet

2024-05-21 Thread Marcus
Thanks for your answers! I read somewhere that a VPN would really have an impact on performance, so it was not recommended, and I found the v2 protocol. But a VPN feels like the solution and you have to accept the lower speed. Thanks again! On Tue, May 21 2024 at 17:07:48 +1000, Malcolm Haak wrote

[ceph-users] How network latency affects ceph performance really with NVME only storage?

2024-05-21 Thread Stefan Bauer
Dear Users, I recently set up a new Ceph 3-node cluster. The network is meshed between all nodes (2 x 25G with DAC). Storage is flash only (Kioxia 3.2 TB BiCS FLASH 3D TLC, KCMYXVUG3T20). The latency with ping tests between the nodes shows: # ping 10.1.3.13 PING 10.1.3.13 (10.1.3.13) 56(84) bytes of
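Beyond ping, a single-threaded small-write benchmark is one way to see how much of that round-trip latency ends up in Ceph's write path (the pool name is a placeholder; use a throwaway pool):

    rados bench -p testbench 10 write -b 4096 -t 1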

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread Frank Schilder
We are using the read-intensive Kioxia drives (octopus cluster) in RBD pools and are very happy with them. I don't think it's the drives. The last possibility I could think of is CPU. We run 4 OSDs per 1.92TB Kioxia drive to utilize their performance (single OSD per disk doesn't cut it at all) a

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread 서민우
We used the "Kioxia KCD6XVUL3T20" model. Is there any infamous information about this model? On Fri, May 17, 2024 at 2:58 AM, Anthony D'Atri wrote: > If using jumbo frames, also ensure that they're consistently enabled on > all OS instances and network devices. > > > On May 16, 2024, at 09:30, Frank Schilder wrote
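A quick way to confirm that jumbo frames really are consistent end-to-end (8972 = 9000 minus IP and ICMP headers; the peer address is a placeholder):

    ip link show | grep mtu
    ping -M do -s 8972 -c 3 <peer-ip>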

[ceph-users] Ceph osd df tree takes a long time to respond

2024-05-21 Thread Huy Nguyen
Ceph version 14.2.7. The 'ceph osd df tree' command takes longer than usual, but I can't find out the reason. The monitor node still has plenty of available RAM and CPU resources. I checked the monitor and mgr logs but nothing seems useful. I checked an older cluster on version 13.2.10 but Ceph

[ceph-users] Re: Cephfs over internet

2024-05-21 Thread Malcolm Haak
Yeah, you really want to do this over a VPN. Performance is going to be average at best. It would probably be faster to re-export it as NFS/SMB and push that across the internet. On Mon, May 20, 2024 at 11:37 PM Marc wrote: > > > Hi all, > > Due to so many reasons (political, heating problems, l
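A rough sketch of the re-export idea using the kernel NFS server on a host that already mounts the CephFS (paths, credentials and the client network are placeholders; nfs-ganesha / 'ceph nfs' would be the more integrated route):

    mount -t ceph mon1:6789:/ /mnt/cephfs -o name=backup,secretfile=/etc/ceph/backup.secret
    echo '/mnt/cephfs 203.0.113.0/24(rw,sync,no_subtree_check)' >> /etc/exports
    exportfs -ra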