[ceph-users] Re: Subscribe

2024-07-25 Thread Anthony D'Atri
Known problem. I’m managing this list manually (think: interactive Python shell) until the CLT finds somebody with the chops to set it up fresh on better infra and not lose the archives. I’ll get dobr...@gmu.edu added. > On Jul 25, 2024, at 8:55 AM, Dan O'Brien wrote:

[ceph-users] Re: Questions about the usage of space in Ceph

2024-07-13 Thread Anthony D'Atri
> > My Ceph cluster has a CephFS file system, using an erasure-code data pool > (k=8, m=2), which has used 14TiB of space. My CephFS has 19 subvolumes, and > each subvolume automatically creates a snapshot every day and keeps it for 3 > days. The problem is that when I manually

[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-12 Thread Anthony D'Atri
raid controller where each disk is in >> its own RAID-0 volume? >> >> I'm just trying to clarify a little bit. You can imagine that nobody wants >> to be that user that does this against the documentation's guidelines and >> then something goes terribly wrong. &

[ceph-users] Re: Help with Mirroring

2024-07-12 Thread Anthony D'Atri
require staging space one doesn’t have if using and transferring files. And of course it’s right out if the RBD volume is currently attached. > > > Zitat von Anthony D'Atri : > >>> >>> I would like to use mirroring to facilitate migrating from an existing >&

[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-12 Thread Anthony D'Atri
rwent a long and arduous engagement with the manufacturer to fix a firmware design flaw. So I don’t place much stock in Dell recommendations. They RAALLLY like to stuff RoC HBAs down our throats, and they mark them up outrageously. — aad > > Thanks again, > -Drew > > >

[ceph-users] Re: Help with Mirroring

2024-07-11 Thread Anthony D'Atri
> > I would like to use mirroring to facilitate migrating from an existing > Nautilus cluster to a new cluster running Reef. Right now I'm looking at > RBD mirroring. I have studied the RBD Mirroring section of the > documentation, but it is unclear to me which commands need to be issued on >

[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Anthony D'Atri
> > Isn’t the supported/recommended configuration to use an HBA if you have to > but never use a RAID controller? That may be something I added to the docs. My contempt for RAID HBAs knows no bounds ;) Ceph doesn’t care. Passthrough should work fine, I’ve done that for tens of thousands

[ceph-users] Re: Repurposing some Dell R750s for Ceph

2024-07-11 Thread Anthony D'Atri
Agree with everything Robin wrote here. RAID HBAs FTL. Even in passthrough mode, it’s still an [absurdly expensive] point of failure, but a server in the rack is worth two on backorder. Moreover, I’m told that it is possible to retrofit with cables and possibly an AIC mux / expander. e.g.

[ceph-users] Re: Cluster Alerts

2024-07-03 Thread Anthony D'Atri
https://docs.ceph.com/en/quincy/mgr/crash/ > On Jul 3, 2024, at 08:27, filip Mutterer wrote: > > In my cluster I have old Alerts, how should solved Alerts be handled? > > Just wait until they disappear or Silence them? > > Whats the recommended way? > > > filip >
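
For reference, handling old alerts from the crash module is usually a matter of archiving them; a minimal sketch (the crash ID is hypothetical):

ceph crash ls                                          # list recent and archived crashes
ceph crash info 2024-07-01T12:00:00.000000Z_abcd       # inspect one entry
ceph crash archive 2024-07-01T12:00:00.000000Z_abcd    # silence a single crash
ceph crash archive-all                                 # or silence everything at once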

[ceph-users] Re: CephFS constant high write I/O to the metadata pool

2024-07-02 Thread Anthony D'Atri
This was common in the NFS days, and some Linux distributions deliberately slewed the execution time. find over an NFS mount was a sure-fire way to horque the server. (e.g. Convex C1) IMHO since the tool relies on a static index it isn't very useful, and I routinely remove any variant from my

[ceph-users] Re: OSD service specs in mixed environment

2024-06-28 Thread Anthony D'Atri
>> >> But this in a spec doesn't match it: >> >> size: '7000G:' >> >> This does: >> >> size: '6950G:' There definitely is some rounding within Ceph, and base 2 vs base 10 shenanigans. > > $ cephadm shell ceph-volume inventory /dev/sdc --format json | jq > .sys_api.human_readable_size
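
A minimal sketch of an OSD service spec using such a size filter; the service_id and threshold here are assumptions, not taken from the thread:

cat > osd-spec.yaml <<'EOF'
service_type: osd
service_id: big_hdds          # assumed name
placement:
  host_pattern: '*'
spec:
  data_devices:
    size: '6950G:'            # only devices of at least 6950G
EOF
ceph orch apply -i osd-spec.yaml --dry-run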

[ceph-users] Re: Viability of NVMeOF/TCP for VMWare

2024-06-27 Thread Anthony D'Atri
There are folks actively working on this gateway and there's a Slack channel. I haven't used it myself yet. My understanding is that ESXi supports NFS. Some people have had good success mounting KRBD volumes on a gateway system or VM and re-exporting via NFS. > On Jun 27, 2024, at 09:01,
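
A rough sketch of the KRBD-plus-NFS approach mentioned here; pool, image, export path, and client name are made up:

rbd create vmware/esx-ds1 --size 4T
rbd map vmware/esx-ds1                 # returns e.g. /dev/rbd0
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /export/esx-ds1
echo '/export/esx-ds1 esxi01(rw,sync,no_root_squash)' >> /etc/exports
exportfs -ra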

[ceph-users] Test after list GC

2024-06-24 Thread Anthony D'Atri
Here’s a test after de-crufting held messages. Grok the fullness. — aad

[ceph-users] Re: Lot of spams on the list

2024-06-24 Thread Anthony D'Atri
I’m not sure if I have access but I can try. > On Jun 24, 2024, at 4:37 PM, Kai Stian Olstad wrote: > > On 24.06.2024 19:15, Anthony D'Atri wrote: >> * Subscription is now moderated >> * The three worst spammers (you know who they are) have been removed >> * I’v

[ceph-users] Re: Lot of spams on the list

2024-06-24 Thread Anthony D'Atri
* Subscription is now moderated * The three worst spammers (you know who they are) have been removed * I’ve deleted tens of thousands of crufty mail messages from the queue The list should work normally now. Working on the backlog of held messages. 99% are bogus, but I want to be careful wrt

[ceph-users] Re: Full list of metrics provided by ceph exporter daemon

2024-06-20 Thread Anthony D'Atri
porter: > ceph_rgw_qactive{instance_id="a"} 0 ceph_rgw_qactive{instance_id="a"} 0 > > > чт, 20 июн. 2024 г. в 20:09, Anthony D'Atri : > >> curl http://endpoint:port/metrics >> >>> On Jun 20, 2024, at 10:15, Peter Razumovsky >> wrote: &g

[ceph-users] Re: Full list of metrics provided by ceph exporter daemon

2024-06-20 Thread Anthony D'Atri
curl http://endpoint:port/metrics > On Jun 20, 2024, at 10:15, Peter Razumovsky wrote: > > Hello! > > I'm using Ceph Reef with Rook v1.13 and want to find somewhere a full list > of metrics exported by brand new ceph exporter daemon. We found that some > metrics have been changed after moving

[ceph-users] Re: How to change default osd reweight from 1.0 to 0.5

2024-06-19 Thread Anthony D'Atri
I’ve thought about this strategy in the past. I think you might enter a cron job to reset any OSDs at 1.0 to 0.5, but really the balancer module or JJ balancer is a better idea than old-style reweight. > On Jun 19, 2024, at 2:22 AM, 서민우 wrote: > > Hello~ > > Our ceph cluster uses 260
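
For context, the two approaches look roughly like this (OSD ID and values are placeholders):

# old-style override, what a cron job would keep re-applying:
ceph osd reweight 123 0.5
# generally the better answer on current releases:
ceph balancer mode upmap
ceph balancer on
ceph balancer status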

[ceph-users] Re: Monitoring

2024-06-18 Thread Anthony D'Atri
Easier to ignore any node_exporter that Ceph (or k8s) deploys and just deploy your own on a different port across your whole fleet. > On Jun 18, 2024, at 13:56, Alex wrote: > > But how do you combine it with Prometheus node exporter built into Ceph?
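
A minimal sketch of running your own node_exporter on a separate port so it doesn't collide with the one cephadm deploys (port 9101 is just an example):

node_exporter --web.listen-address=":9101" &
curl -s http://localhost:9101/metrics | head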

[ceph-users] Re: Monitoring

2024-06-18 Thread Anthony D'Atri
I don't, I have the fleetwide monitoring / observability systems query ceph_exporter and a fleetwide node_exporter instance on 9101. ymmv. > On Jun 18, 2024, at 09:25, Alex wrote: > > Good morning. > > Our RH Ceph comes with Prometheus monitoring "built in". How does everyone > integrate

[ceph-users] Re: Incomplete PGs. Ceph Consultant Wanted

2024-06-17 Thread Anthony D'Atri
Ohhh, so multiple OSD failure domains on a single SAN node? I suspected as much. I've experienced a Ceph cluster built on SanDisk InfiniFlash, which was arguably somewhere between SAN and DAS. Each of 4 IF chassis drove 4x OSD nodes via SAS, but it was zoned such that the chassis was the

[ceph-users] Re: Incomplete PGs. Ceph Consultant Wanted

2024-06-17 Thread Anthony D'Atri
>> >> * We use replicated pools >> * Replica 2, min replicas 1. Note to self: Change the docs and default to discourage this. This is rarely appropriate in production. You had multiple overlapping drive failures?

[ceph-users] Re: why not block gmail?

2024-06-17 Thread Anthony D'Atri
Yes. I have admin juice on some other Ceph lists, I've asked for it here as well so that I can manage with alacrity. > On Jun 17, 2024, at 09:31, Robert W. Eckert wrote: > > Is there any way to have a subscription request validated? > > -Original Message- > From: Marc > Sent:

[ceph-users] Re: Performance issues RGW (S3)

2024-06-13 Thread Anthony D'Atri
, but benefits from a generous pg_num and multiple OSDs so that it isn't bottlenecked. > On Jun 13, 2024, at 15:13, Sinan Polat wrote: > > 500K object size > >> Op 13 jun 2024 om 21:11 heeft Anthony D'Atri het >> volgende geschreven: >> >> 

[ceph-users] Re: Performance issues RGW (S3)

2024-06-13 Thread Anthony D'Atri
e some TCP Retransmissions on the RGW Node, but maybe that's 'normal'. > > Any ideas/suggestions? > > On 2024-06-11 22:08, Anthony D'Atri wrote: >>> I am not sure adding more RGW's will increase the performance. >> That was a tangent. >>> To be clear, that means whatev

[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Anthony D'Atri
Michael > > > From: Daniel Brown > Sent: Wednesday, June 12, 2024 9:18 AM > To: Anthony D'Atri > Cc: Michael Worsham ; ceph-users@ceph.io > > Subject: Re: [ceph-users] Patching Ceph cluster > > This is an external email. Pl

[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Anthony D'Atri
If you have: * pg_num too low (defaults are too low) * pg_num not a power of 2 * pg_num != number of OSDs in the pool * balancer not enabled any of those might result in imbalance. > On Jun 12, 2024, at 07:33, Eugen Block wrote: > > I don't have any good explanation at this point. Can you
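
Quick checks for the conditions listed above (pool name and pg_num value are placeholders):

ceph osd pool autoscale-status             # current vs. suggested pg_num
ceph osd pool get cephfs.data pg_num       # hypothetical pool name
ceph osd pool set cephfs.data pg_num 256   # power of two, sized to the OSD count
ceph balancer status
ceph balancer on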

[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Anthony D'Atri
Do you mean patching the OS? If so, easy -- one node at a time, then after it comes back up, wait until all PGs are active+clean and the mon quorum is complete before proceeding. > On Jun 12, 2024, at 07:56, Michael Worsham > wrote: > > What is the proper way to patch a Ceph cluster and
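
A rough per-node sequence; the noout flag isn't mentioned in the reply, it's a common addition:

ceph osd set noout                       # optional: avoid rebalancing during the reboot
# patch and reboot the node ...
ceph -s                                  # wait for all PGs active+clean
ceph quorum_status | jq .quorum_names    # confirm the mon quorum is complete
ceph osd unset noout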

[ceph-users] Re: Performance issues RGW (S3)

2024-06-11 Thread Anthony D'Atri
> > I am not sure adding more RGW's will increase the performance. That was a tangent. > To be clear, that means whatever.rgw.buckets.index ? >>> No, sorry my bad. .index is 32 and .data is 256. >> Oh, yeah. Does `ceph osd df` show you at the far right like 4-5 PG replicas >> on each OSD?

[ceph-users] Re: Attention: Documentation - mon states and names

2024-06-11 Thread Anthony D'Atri
Custom names were never really 100% implemented, and I would not be surprised if they don't work in Reef. > On Jun 11, 2024, at 14:02, Joel Davidow wrote: > > Zac, > > Thanks for your super-fast response and action on this. Those four items > are great and the corresponding email as

[ceph-users] Re: About disk disk iops and ultil peak

2024-06-10 Thread Anthony D'Atri
What specifically are your OSD devices? > On Jun 10, 2024, at 22:23, Phong Tran Thanh wrote: > > Hi ceph user! > > I am encountering a problem with IOPS and disk utilization of OSD. Sometimes, > my disk peaks in IOPS and utilization become too high, which affects my > cluster and causes slow

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
>> To be clear, you don't need more nodes. You can add RGWs to the ones you >> already have. You have 12 OSD nodes - why not put an RGW on each? > Might be an option, just don't like the idea to host multiple components on > nodes. But I'll consider it. I really don't like mixing mon/mgr

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
>>> You are right here, but we use Ceph mainly for RBD. It performs 'good >>> enough' for our RBD load. >> You use RBD for archival? > > No, storage for (light-weight) virtual machines. I'm surprised that it's enough, I've seen HDDs fail miserably in that role. > The (CPU) load on the

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-10 Thread Anthony D'Atri
Eh? cf. Mark and Dan's 1TB/s presentation. > On Jun 10, 2024, at 13:58, Mark Lehrer wrote: > > It > seems like Ceph still hasn't adjusted to SSD performance.

[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Anthony D'Atri
Scrubs are of PGs not OSDs, the lead OSD for a PG orchestrates subops to secondary OSDs. If you can point me to where this is in docs/src I'll clarify it, ideally if you can put in a tracker ticket and send me a link. Scrubbing all PGs on an OSD at once or even in sequence would be impactful.
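
For reference, scrubs are requested per PG, e.g. (OSD and PG IDs are made up):

ceph pg ls-by-osd 12       # PGs whose acting set includes osd.12
ceph pg deep-scrub 3.1f    # deep-scrub a single PG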

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
>>> - 20 spinning SAS disks per node. >> Don't use legacy HDDs if you care about performance. > > You are right here, but we use Ceph mainly for RBD. It performs 'good enough' > for our RBD load. You use RBD for archival? >>> - Some nodes have 256GB RAM, some nodes 128GB. >> 128GB is on

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
> Hi all, > > My Ceph setup: > - 12 OSD nodes, 4 OSD nodes per rack. Replication of 3, 1 replica per rack. > - 20 spinning SAS disks per node. Don't use legacy HDDs if you care about performance. > - Some nodes have 256GB RAM, some nodes 128GB. 128GB is on the low side for 20 OSDs. > - CPU

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-09 Thread Anthony D'Atri
> > 6) There are advanced tuning like numa pinning, but you should get decent > speeds without doing fancy stuff. This is why I’d asked the OP for the CPU in use. Mark and Dan’s recent and superlative presentation about 1TB/s with Ceph underscored how tunings can make a very real

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-09 Thread Anthony D'Atri

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-07 Thread Anthony D'Atri
kind of issue if I had to > guess. It's tricky to debug when there is no obvious bottleneck. rados bench is a good smoke test, but fio may better represent the E2E experience. > > Thanks, > Mark > > > > > On Fri, Jun 7, 2024 at 9:47 AM Anthony D'Atri wrote: >

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-07 Thread Anthony D'Atri
Please describe: * server RAM and CPU * osd_memory_target * OSD drive model > On Jun 7, 2024, at 11:32, Mark Lehrer wrote: > > I've been using MySQL on Ceph forever, and have been down this road > before but it's been a couple of years so I wanted to see if there is > anything new here. > >

[ceph-users] Re: tuning for backup target cluster

2024-06-04 Thread Anthony D'Atri
Or partition, or use LVM. I've wondered for years what the practical differences are between using a namespace and a conventional partition. > On Jun 4, 2024, at 07:59, Robert Sander wrote: > > On 6/4/24 12:47, Lukasz Borek wrote: > >> Using cephadm, is it possible to cut part of the NVME

[ceph-users] Re: tuning for backup target cluster

2024-06-03 Thread Anthony D'Atri

[ceph-users] Re: tuning for backup target cluster

2024-05-29 Thread Anthony D'Atri
> You also have the metadata pools used by RGW that ideally need to be on NVME. The OP seems to intend shared NVMe for WAL+DB, so that the omaps are on NVMe that way. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-29 Thread Anthony D'Atri
> > Each simultaneously. These are SATA3 with 128mb cache drives. Turn off the cache. > The bus is 6 gb/s. I expect usage to be in the 90+% range not the 50% range. "usage" as measured how? > > On Mon, May 27, 2024 at 5:37 PM Anthony D'Atri wrote: &g

[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-27 Thread Anthony D'Atri
> > hdd iops on the three discs hover around 80 +/- 5. Each or total? I wouldn’t expect much more than 80 per drive.

[ceph-users] Re: tuning for backup target cluster

2024-05-27 Thread Anthony D'Atri
usually down at >> 8 PG’s and that will be limiting things as well. >> >> Darren Soothill

[ceph-users] Re: tuning for backup target cluster

2024-05-25 Thread Anthony D'Atri
> Hi Everyone, > > I'm putting together a HDD cluster with an ECC pool dedicated to the backup > environment. Traffic via s3. Version 18.2, 7 OSD nodes, 12 * 12TB HDD + > 1NVME each, QLC, man. QLC. That said, I hope you're going to use that single NVMe SSD for at least the index pool. Is
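
One way to keep the RGW index pool on the NVMe device class, sketched with assumed rule and pool names:

ceph osd crush rule create-replicated rgw-index-nvme default host nvme
ceph osd pool set default.rgw.buckets.index crush_rule rgw-index-nvme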

[ceph-users] Re: Ceph ARM providing storage for x86

2024-05-25 Thread Anthony D'Atri
Why not? The hwarch doesn't matter. > On May 25, 2024, at 07:35, filip Mutterer wrote: > > Is this known to be working: > > Setting up the Ceph Cluster with ARM and then use the storage with X86 > Machines for example LXC, Docker and KVM? > > Is this possible? > > Greetings > > filip >

[ceph-users] Re: Best practice regarding rgw scaling

2024-05-23 Thread Anthony D'Atri
I'm interested in these responses. Early this year a certain someone related having good results by deploying an RGW on every cluster node. This was when we were experiencing ballooning memory usage conflicting with K8s limits when running 3. So on the cluster in question we now run 25.

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Anthony D'Atri
> I think it is his lab so maybe it is a test setup for production. Home production? > > I don't think it matters too much with scrubbing, it is not like it is related > to how long you were offline. It will scrub just as much being 1 month > offline as being 6 months offline. > >> >> If

[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Anthony D'Atri
If you have a single node arguably ZFS would be a better choice. > On May 21, 2024, at 14:53, adam.ther wrote: > > Hello, > > To save on power in my home lab can I have a single node CEPH cluster sit > idle and powered off for 3 months at a time then boot only to refresh > backups? Or will

[ceph-users] Re: Please discuss about Slow Peering

2024-05-21 Thread Anthony D'Atri
> > > I have additional questions, > We use 13 disk (3.2TB NVMe) per server and allocate one OSD to each disk. In > other words 1 Node has 13 osds. > Do you think this is inefficient? > Is it better to create more OSD by creating LV on the disk? Not with the most recent Ceph releases. I

[ceph-users] Re: cephadm bootstraps cluster with bad CRUSH map(?)

2024-05-20 Thread Anthony D'Atri
> On May 20, 2024, at 2:24 PM, Matthew Vernon wrote: > > Hi, > > Thanks for your help! > > On 20/05/2024 18:13, Anthony D'Atri wrote: > >> You do that with the CRUSH rule, not with osd_crush_chooseleaf_type. Set >> that back to the default value

[ceph-users] Re: cephadm bootstraps cluster with bad CRUSH map(?)

2024-05-20 Thread Anthony D'Atri
> >>> This has left me with a single sad pg: >>> [WRN] PG_AVAILABILITY: Reduced data availability: 1 pg inactive >>>pg 1.0 is stuck inactive for 33m, current state unknown, last acting [] >>> >> .mgr pool perhaps. > > I think so > >>> ceph osd tree shows that CRUSH picked up my racks OK,

[ceph-users] Re: cephadm bootstraps cluster with bad CRUSH map(?)

2024-05-20 Thread Anthony D'Atri
> On May 20, 2024, at 12:21 PM, Matthew Vernon wrote: > > Hi, > > I'm probably Doing It Wrong here, but. My hosts are in racks, and I wanted > ceph to use that information from the get-go, so I tried to achieve this > during bootstrap. > > This has left me with a single sad pg: > [WRN]

[ceph-users] Re: Please discuss about Slow Peering

2024-05-16 Thread Anthony D'Atri
If using jumbo frames, also ensure that they're consistently enabled on all OS instances and network devices. > On May 16, 2024, at 09:30, Frank Schilder wrote: > > This is a long shot: if you are using octopus, you might be hit by this > pglog-dup problem: >
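
A quick way to verify jumbo frames end to end (interface name and peer IP are placeholders):

ip link show eth0 | grep mtu
ping -M do -s 8972 10.0.0.2    # 9000 MTU minus 28 bytes of IP/ICMP headers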

[ceph-users] Re: Upgrading Ceph Cluster OS

2024-05-14 Thread Anthony D'Atri
https://docs.ceph.com/en/latest/start/os-recommendations/#platforms You might want to go to 20.04, then to Reef, then to 22.04 > On May 13, 2024, at 12:22, Nima AbolhassanBeigi > wrote: > > The ceph version is 16.2.13 pacific. > It's deployed using ceph-ansible. (release branch stable-6.0)

[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-12 Thread Anthony D'Atri
I halfway suspect that something akin to the speculation in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7MWAHAY7NCJK2DHEGO6MO4SWTLPTXQMD/ is going on. Below are reservations reported by a random OSD that serves (mostly) an EC RGW bucket pool. This is with the mclock

[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-02 Thread Anthony D'Atri
>> For our customers we are still disabling mclock and using wpq. Might be >> worth trying. >> >> > Could you please elaborate a bit on the issue(s) preventing the > use of mClock. Is this specific to only the slow backfill rate and/or other > issue? > > This feedback would help prioritize

[ceph-users] Re: Recoveries without any misplaced objects?

2024-04-24 Thread Anthony D'Atri
Do you see *keys* aka omap traffic? Especially if you have RGW set up? > On Apr 24, 2024, at 15:37, David Orman wrote: > > Did you ever figure out what was happening here? > > David > > On Mon, May 29, 2023, at 07:16, Hector Martin wrote: >> On 29/05/2023

[ceph-users] Re: Best practice and expected benefits of using separate WAL and DB devices with Bluestore

2024-04-23 Thread Anthony D'Atri
> On Apr 23, 2024, at 12:24, Maged Mokhtar wrote: > > For nvme:HDD ratio, yes you can go for 1:10, or if you have extra slots you > can use 1:5 using smaller capacity/cheaper nvmes, this will reduce the impact > of nvme failures. On occasion I've seen a suggestion to mirror the fast

[ceph-users] Re: Status of IPv4 / IPv6 dual stack?

2024-04-23 Thread Anthony D'Atri
Sounds like an opportunity for you to submit an expansive code PR to implement it. > On Apr 23, 2024, at 04:28, Marc wrote: > >> I have removed dual-stack-mode-related information from the documentation >> on the assumption that dual-stack mode was planned but never fully >> implemented. >>

[ceph-users] Re: Best practice and expected benefits of using separate WAL and DB devices with Bluestore

2024-04-21 Thread Anthony D'Atri
me-series DB and watch both for drives nearing EOL and their burn rates. > > On Sun, Apr 21, 2024 at 11:06 PM Anthony D'Atri > wrote: >> >> A deep archive cluster benefits from NVMe too. You can use QLC up to 60TB >> in size, 32 of those in one RU makes for a cluste

[ceph-users] Re: Why CEPH is better than other storage solutions?

2024-04-21 Thread Anthony D'Atri
Vendor lock-in only benefits vendors. You’ll pay outrageously for support / maint then your gear goes EOL and you’re trolling eBay for parts. With Ceph you use commodity servers, you can swap 100% of the hardware without taking downtime with servers and drives of your choice. And you get

[ceph-users] Re: Best practice and expected benefits of using separate WAL and DB devices with Bluestore

2024-04-21 Thread Anthony D'Atri
A deep archive cluster benefits from NVMe too. You can use QLC up to 60TB in size, 32 of those in one RU makes for a cluster that doesn’t take up the whole DC. > On Apr 21, 2024, at 5:42 AM, Darren Soothill wrote: > > Hi Niklaus, > > Lots of questions here but let me tray and get through

[ceph-users] Re: Upgrading Ceph 15 to 18

2024-04-21 Thread Anthony D'Atri
, > Malte > >> On 21.04.24 04:14, Anthony D'Atri wrote: >> The party line is to jump no more than 2 major releases at once. >> So that would be Octopus (15) to Quincy (17) to Reef (18). >> Squid (19) is due out soon, so you may want to pause at Quincy until Squid >>

[ceph-users] Re: Upgrading Ceph 15 to 18

2024-04-20 Thread Anthony D'Atri
The party line is to jump no more than 2 major releases at once. So that would be Octopus (15) to Quincy (17) to Reef (18). Squid (19) is due out soon, so you may want to pause at Quincy until Squid is released and has some runtime and maybe 19.2.1, then go straight to Squid from Quincy to

[ceph-users] Re: Mysterious Space-Eating Monster

2024-04-19 Thread Anthony D'Atri
Look for unlinked but open files, it may not be Ceph at fault. Suboptimal logrotate rules can cause this. lsof, fsck -n, etc. > On Apr 19, 2024, at 05:54, Sake Ceph wrote: > > Hi Matthew, > > Cephadm doesn't cleanup old container images, at least with Quincy. After a > upgrade we run the
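
A quick check for the unlinked-but-open case described here:

lsof +L1                   # open files whose link count is zero
df -h; du -sh /var/log     # df vs. du divergence is the usual symptom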

[ceph-users] Re: Best practice and expected benefits of using separate WAL and DB devices with Bluestore

2024-04-19 Thread Anthony D'Atri
This is a ymmv thing, it depends on one's workload. > > However, we have some questions about this and are looking for some guidance > and advice. > > The first one is about the expected benefits. Before we undergo the efforts > involved in the transition, we are wondering if it is even

[ceph-users] Re: Performance of volume size, not a block size

2024-04-15 Thread Anthony D'Atri
If you're using SATA/SAS SSDs I would aim for 150-200 PGs per OSD as shown by `ceph osd df`. If NVMe, 200-300 unless you're starved for RAM. > On Apr 15, 2024, at 07:07, Mitsumasa KONDO wrote: > > Hi Menguy-san, > > Thank you for your reply. Users who use large IO with tiny volumes are a >

[ceph-users] Re: PG inconsistent

2024-04-12 Thread Anthony D'Atri
If you're using an Icinga active check that just looks for SMART overall-health self-assessment test result: PASSED then it's not doing much for you. That bivalue status can be shown for a drive that is decidedly an ex-parrot. Gotta look at specific attributes, which is thorny since they

[ceph-users] Re: Impact of large PG splits

2024-04-12 Thread Anthony D'Atri
One can up the ratios temporarily but it's all too easy to forget to reduce them later, or think that it's okay to run all the time with reduced headroom. Until a host blows up and you don't have enough space to recover into. > On Apr 12, 2024, at 05:01, Frédéric Nass > wrote: > > > Oh, and
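
The temporary knobs in question, for reference; the values are examples only, and remember to revert them:

ceph osd set-nearfull-ratio 0.90
ceph osd set-backfillfull-ratio 0.92
ceph osd set-full-ratio 0.97
ceph osd dump | grep ratio     # confirm current settings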

[ceph-users] Re: DB/WALL and RGW index on the same NVME

2024-04-08 Thread Anthony D'Atri
My understanding is that omap and EC are incompatible, though. > On Apr 8, 2024, at 09:46, David Orman wrote: > > I would suggest that you might consider EC vs. replication for index data, > and the latency implications. There's more than just the nvme vs. rotational > discussion to

[ceph-users] Re: Impact of Slow OPS?

2024-04-06 Thread Anthony D'Atri
ISTR that the Ceph slow op threshold defaults to 30 or 32 seconds. Naturally an op over the threshold often means there are more below the reporting threshold. 120s I think is the default Linux op timeout. > On Apr 6, 2024, at 10:53 AM, David C. wrote: > > Hi, > > Do slow ops impact
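
The threshold referred to is osd_op_complaint_time; a quick check:

ceph config get osd osd_op_complaint_time    # defaults to 30 seconds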

[ceph-users] Re: Bucket usage per storage classes

2024-04-04 Thread Anthony D'Atri
A bucket may contain objects spread across multiple storage classes, and AIUI the head object is always in the default storage class, so I'm not sure *exactly* what you're after here. > On Apr 4, 2024, at 17:09, Ondřej Kukla wrote: > > Hello, > > I’m playing around with Storage classes in

[ceph-users] Re: RBD image metric

2024-04-04 Thread Anthony D'Atri

[ceph-users] Re: question about rbd_read_from_replica_policy

2024-04-04 Thread Anthony D'Atri
Network RTT? > On Apr 4, 2024, at 03:44, Noah Elias Feldt wrote: > > Hello, > I have a question about a setting for RBD. > How exactly does "rbd_read_from_replica_policy" with the value "localize" > work? > According to the RBD documentation, read operations will be sent to the > closest OSD

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Anthony D'Atri
t all, but IMO RGW > index/usage(/log/gc?) pools are always better off using asynchronous > recovery. > > Josh > > On Wed, Apr 3, 2024 at 1:48 PM Anthony D'Atri wrote: >> >> We currently have in src/common/options/global.yaml.in >> >> - name: osd_async_

[ceph-users] Re: RBD image metric

2024-04-03 Thread Anthony D'Atri
Depending on your Ceph release you might need to enable rbdstats. Are you after provisioned, allocated, or both sizes? Do you have object-map and fast-diff enabled? They speed up `rbd du` massively. > On Apr 3, 2024, at 00:26, Szabo, Istvan (Agoda) > wrote: > > Hi, > > Trying to pull out
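
A sketch of the pieces mentioned above; the pool and image names are made up:

ceph config set mgr mgr/prometheus/rbd_stats_pools "rbd"    # expose per-image RBD metrics
rbd feature enable rbd/vm-disk-01 exclusive-lock object-map fast-diff
rbd du rbd/vm-disk-01     # fast with object-map/fast-diff enabled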

[ceph-users] Re: Slow ops during recovery for RGW index pool only when degraded OSD is primary

2024-04-03 Thread Anthony D'Atri
We currently have in src/common/options/global.yaml.in:

- name: osd_async_recovery_min_cost
  type: uint
  level: advanced
  desc: A mixture measure of number of current log entries difference and
    historical missing objects, above which we switch to use asynchronous recovery when

[ceph-users] Re: Questions about rbd flatten command

2024-04-02 Thread Anthony D'Atri
Do these RBD volumes have a full feature set? I would think that fast-diff and objectmap would speed this. > On Apr 2, 2024, at 00:36, Henry lol wrote: > > I'm not sure, but it seems that read and write operations are > performed for all objects in rbd. > If so, is there any method to apply

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Anthony D'Atri
> a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay Looks like you just had an mgr failover? Could be that the secondary mgr hasn't caught up with current events.

[ceph-users] Re: stretch mode item not defined

2024-03-26 Thread Anthony D'Atri
Yes, you will need to create datacenter buckets and move your host buckets under them. > On Mar 26, 2024, at 09:18, ronny.lippold wrote: > > hi there, need some help please. > > we are planning to replace our rbd-mirror setup and go to stretch mode. > the goal is, to have the cluster in 2
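
Roughly, with assumed bucket and host names:

ceph osd crush add-bucket dc1 datacenter
ceph osd crush add-bucket dc2 datacenter
ceph osd crush move dc1 root=default
ceph osd crush move dc2 root=default
ceph osd crush move host-a datacenter=dc1
ceph osd crush move host-b datacenter=dc2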

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Anthony D'Atri
First try "ceph osd down 89" > On Mar 25, 2024, at 15:37, Alexander E. Patrakov wrote: > > On Mon, Mar 25, 2024 at 7:37 PM Torkil Svensgaard wrote: >> >> >> >> On 24/03/2024 01:14, Torkil Svensgaard wrote: >>> On 24-03-2024 00:31, Alexander E. Patrakov wrote: Hi Torkil, >>> >>> Hi

[ceph-users] Re: Are we logging IRC channels?

2024-03-23 Thread Anthony D'Atri
I fear this will raise controversy, but in 2024 what’s the value in perpetuating an interface from early 1980s BITnet batch operating systems? > On Mar 23, 2024, at 5:45 AM, Janne Johansson wrote: > >> Sure! I think Wido just did it all unofficially, but afaik we've lost >> all of those

[ceph-users] Re: Reef (18.2): Some PG not scrubbed/deep scrubbed for 1 month

2024-03-22 Thread Anthony D'Atri
Perhaps emitting an extremely low value could have value for identifying a compromised drive? > On Mar 22, 2024, at 12:49, Michel Jouvin > wrote: > > Frédéric, > > We arrived at the same conclusions! I agree that an insane low value would be > a good addition: the idea would be that the

[ceph-users] Re: High OSD commit_latency after kernel upgrade

2024-03-22 Thread Anthony D'Atri
n this forum people recommend upgrading "M3CR046" > https://forums.unraid.net/topic/134954-warning-crucial-mx500-ssds-world-of-pain-stay-away-from-these/ > But actually in my ud cluster all the drives are "M3CR045" and have lower > latency. I'm really confused. >

[ceph-users] Re: High OSD commit_latency after kernel upgrade

2024-03-22 Thread Anthony D'Atri
https://askubuntu.com/questions/1454997/how-to-stop-sys-from-changing-usb-ssd-provisioning-mode-from-unmap-to-full-in-ub > On Mar 22, 2024, at 09:36, Özkan Göksu wrote: > > Hello! > >

[ceph-users] Re: Need easy way to calculate Ceph cluster space for SolarWinds

2024-03-20 Thread Anthony D'Atri
> 37   ssd   18.19040   1.0   18 TiB   13 TiB    13 TiB    13 GiB   53 GiB   5.0 TiB   72.78   1.21   179   up
> 43   ssd   18.19040   1.0   18 TiB   8.9 TiB   8.8 TiB   17 GiB   23 GiB   9.3 TiB   48.71   0.81   178   up
>      TOTAL             873 TiB   527 TiB   525 Ti

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Anthony D'Atri
Grep through the ls output for ‘rados bench’ leftovers, it’s easy to leave them behind. > On Mar 20, 2024, at 5:28 PM, Igor Fedotov wrote: > > Hi Thorne, > > unfortunately I'm unaware of any tools high level enough to easily map files > to rados objects without deep undestanding how this

[ceph-users] Re: Need easy way to calculate Ceph cluster space for SolarWinds

2024-03-20 Thread Anthony D'Atri
> ult.rgw.jv-comm-pool.non-ec    64  32  0 B      0       0 B      0     61 TiB
> default.rgw.jv-va-pool.data    65  32  4.8 TiB  22.17M  14 TiB   7.28  61 TiB
> default.rgw.jv-va-pool.index   66  32  38 GiB   401     113 GiB  0.06  61 TiB
> default.rg

[ceph-users] Re: Need easy way to calculate Ceph cluster space for SolarWinds

2024-03-20 Thread Anthony D'Atri
> On Mar 20, 2024, at 14:42, Michael Worsham > wrote: > > Is there an easy way to poll a Ceph cluster to see how much space is available `ceph df` The exporter has percentages per pool as well. > and how much space is available per bucket? Are you using RGW quotas? > > Looking for a
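
For reference, the relevant commands (the bucket name is a placeholder):

ceph df detail                                 # per-pool stored/avail, incl. %USED
radosgw-admin bucket stats --bucket=backups    # per-bucket usage and quota state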

[ceph-users] Re: Reef (18.2): Some PG not scrubbed/deep scrubbed for 1 month

2024-03-20 Thread Anthony D'Atri
Suggest issuing an explicit deep scrub against one of the subject PGs, see if it takes. > On Mar 20, 2024, at 8:20 AM, Michel Jouvin > wrote: > > Hi, > > We have a Reef cluster that started to complain a couple of weeks ago about > ~20 PGs (over 10K) not scrubbed/deep-scrubbed in time.

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Anthony D'Atri
> Those files are VM disk images, and they're under constant heavy use, so yes- > there /is/ constant severe write load against this disk. Why are you using CephFS for an RBD application?

[ceph-users] Re: ceph osd different size to create a cluster for Openstack : asking for advice

2024-03-13 Thread Anthony D'Atri
NVMe (hope those are enterprise not client) drives aren't likely to suffer the same bottlenecks as HDDs or even SATA SSDs. And a 2:1 size ratio isn't the largest I've seen. So I would just use all 108 OSDs as a single device class and spread the pools across all of them. That way you won't

[ceph-users] Re: Remove cluster_network without routing

2024-03-07 Thread Anthony D'Atri
I think heartbeats will failover to the public network if the private doesn't work -- may not have always done that. >> Hi >> Cephadm Reef 18.2.0. >> We would like to remove our cluster_network without stopping the cluster and >> without having to route between the networks. >> global

[ceph-users] Re: bluestore_min_alloc_size and bluefs_shared_alloc_size

2024-03-06 Thread Anthony D'Atri
> On Feb 28, 2024, at 17:55, Joel Davidow wrote: > > Current situation > - > We have three Ceph clusters that were originally built via cephadm on octopus > and later upgraded to pacific. All osds are HDD (will be moving to wal+db on > SSD) and were resharded after the

[ceph-users] Re: Ceph is constantly scrubbing 1/4 of all PGs and still have pigs not scrubbed in time

2024-03-06 Thread Anthony D'Atri
I don't see these in the config dump. I think you might have to apply them to `global` for them to take effect, not just `osd`, FWIW. > I have tried various settings, like osd_deep_scrub_interval, osd_max_scrubs, > mds_max_scrub_ops_in_progress etc. > All those get ignored.
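
A sketch of applying the options at the global level and verifying what a daemon actually uses (the value shown is just an example):

ceph config set global osd_deep_scrub_interval 1209600
ceph config get osd osd_deep_scrub_interval
ceph config show osd.0 osd_deep_scrub_interval    # effective value on a running daemon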

[ceph-users] Re: Number of pgs

2024-03-05 Thread Anthony D'Atri
the right how many PG replicas are on each OSD. > On Mar 5, 2024, at 14:50, Nikolaos Dandoulakis wrote: > > Hi Anthony, > > I should have said, it’s replicated (3) > > Best, > Nick > > Sent from my phone, apologies for any typos! > From: Anthony D'Atri > Sen
