[ceph-users] Re: have buckets with low number of shards

2021-11-23 Thread DHilsbos
Manoosh; You can't reshard a bucket without downtime. During a reshard RGW creates new RADOS objects to match the new shard number. Then all the RGW objects are moved from the old RADOS objects to the new RADOS objects, and the original RADOS objects are destroyed. The reshard locks the
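
A manual reshard of this kind is typically driven with radosgw-admin; as a rough sketch (the bucket name and shard count below are placeholders), and bearing in mind the bucket is locked for the duration:

    radosgw-admin bucket stats --bucket=mybucket            # current num_shards and object count
    radosgw-admin bucket reshard --bucket=mybucket --num-shards=101
    radosgw-admin reshard status --bucket=mybucket          # confirm the reshard completed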

[ceph-users] Re: Best way to add multiple nodes to a cluster?

2021-11-02 Thread DHilsbos
Zakhar; When adding nodes I usually set the following: noin (OSDs register as up, but stay out), norebalance (new placement shouldn't be calculated when the cluster layout changes; I've been bitten by this not working as expected, so I also set the below), nobackfill (PGs don't move). I then remove noin,
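
As a sketch, the flag sequence described above looks roughly like this (the unset order once the new OSDs are in place is a judgment call):

    ceph osd set noin
    ceph osd set norebalance
    ceph osd set nobackfill
    # ... deploy the new OSDs ...
    ceph osd unset noin
    ceph osd unset nobackfill
    ceph osd unset norebalance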

[ceph-users] Re: Rebooting one node immediately blocks IO via RGW

2021-10-25 Thread DHilsbos
Troels; This sounds like a failure domain issue. If I remember correctly, Ceph defaults to a failure domain of disk (osd), while you need a failure domain of host. Could you do a ceph -s while one of the hosts is offline? You're looking for the HEALTH_ flag, and any errors other than slow
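
To confirm which failure domain a pool's rule is using, something like the following helps (the rule name is a placeholder); look for "type": "host" versus "type": "osd" in the chooseleaf step:

    ceph osd crush rule dump replicated_rule
    ceph -s        # run while one host is down; note the HEALTH_ flag and any undersized/inactive PGs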

[ceph-users] Re: failing dkim

2021-10-25 Thread DHilsbos
MJ; A lot of mailing lists "rewrite" the origin address to one that matches the mailing list server. Here's an example from the Samba mailing list: "samba ; on behalf of; Rowland Penny via samba ". This mailing list relays the email, without modifying the sender, or the envelope address.

[ceph-users] Re: How to make HEALTH_ERR quickly and pain-free

2021-10-25 Thread DHilsbos
MJ; Assuming that you have a replicated pool with 3 replicas and min_size = 2, I would think stopping 2 OSD daemons, or 2 OSD containers would guarantee HEALTH_ERR. Similarly, if you have a replicated pool with 2 replicas, still with min_size = 2, stopping 1 OSD should do the trick. Thank
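
Assuming systemd-managed OSDs (the OSD IDs here are placeholders), a quick test could look like:

    systemctl stop ceph-osd@3 ceph-osd@7
    ceph health detail         # the PGs on those OSDs should go inactive once min_size is no longer met
    systemctl start ceph-osd@3 ceph-osd@7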

[ceph-users] Re: OSD's fail to start after power loss

2021-10-13 Thread DHilsbos
Todd; What version of ceph are you running? Are you running containers or packages? Was the cluster installed manually, or using a deployment tool? Logs provided are for osd ID 31, is ID 31 appropriate for that server? Have you verified that the ceph.conf on that server is intact, and

[ceph-users] Re: Cluster down

2021-10-13 Thread DHilsbos
Jorge; This sounds, to me, like something to discuss with the proxmox folks. Unless there was an IP conflict between the rebooted server, and one of the existing mons, I can't see the ceph cluster going unavailable. Further, I don't see where anything ceph related would cause hypervisors, on

[ceph-users] Re: Ceph cluster Sync

2021-10-12 Thread DHilsbos
Michel; I am neither a Ceph evangelist, nor a Ceph expert, but here is my current understanding: Ceph clusters do not have in-built cross cluster synchronization. That said, there are several things which might meet your needs. 1) If you're just planning your Ceph deployment, then the latest

[ceph-users] Re: urgent question about rdb mirror

2021-10-01 Thread DHilsbos
Ignazio; If your first attempt at asking a question results in no responses, you might consider why, before reposting. I don't use RBD mirroring, so I can only supply theoretical information. Googling RBD mirroring (for me) results in the below as the first result:

[ceph-users] Re: Leader election loop reappears

2021-09-29 Thread DHilsbos
Manuel; Reading through this mailing list this morning, I can't help but mentally connect your issue to Javier's issue. In part because you're both running 16.2.6. Javier's issue seems to be that OSDs aren't registering public / cluster network addresses correctly. His most recent message

[ceph-users] Re: The reason of recovery_unfound pg

2021-08-20 Thread DHilsbos
Satoru; Ok. What your cluster is telling you, then, is that it doesn't know which replica is the "most current" or "correct" replica. You will need to determine that, and let ceph know which one to use as the "good" replica. Unfortunately, I can't help you with this. In fact, if this is

[ceph-users] Re: The reason of recovery_unfound pg

2021-08-20 Thread DHilsbos
Satoru; You said " after restarting all nodes one by one." After each reboot, did you allow the cluster the time necessary to come back to a "HEALTH_OK" status? Thank you, Dominic L. Hilsbos, MBA Vice President – Information Technology Perform Air International Inc. dhils...@performair.com

[ceph-users] Re: Luminous won't fully recover

2021-07-23 Thread DHilsbos
Sean; These lines look bad: 14 scrub errors Reduced data availability: 2 pgs inactive Possible data damage: 8 pgs inconsistent osd.95 (root=default,host=hqosd8) is down I suspect you ran into a hardware issue with one more drives in some of the servers that did not go offline. osd.95 is

[ceph-users] Re: Issue with Nautilus upgrade from Luminous

2021-07-09 Thread DHilsbos
Suresh; I don't believe we use tunables, so I'm not terribly familiar with them. A quick Google search ("ceph tunable") supplied the following pages: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/1.2.3/html/storage_strategies/crush_tunables

[ceph-users] Re: rgw multisite sync not syncing data, error: RGW-SYNC:data:init_data_sync_status: ERROR: failed to read remote data log shards

2021-06-25 Thread DHilsbos
Christian; Do the second site's RGW instance(s) have access to the first site's OSDs? Is the reverse true? It's been a while since I set up the multi-site sync between our clusters, but I seem to remember that, while metadata is exchanged RGW1<-->RGW2, data is exchanged OSD1<-->RGW2. Anyone

[ceph-users] Re: Strategy for add new osds

2021-06-15 Thread DHilsbos
Personally, when adding drives like this, I set noin (ceph osd set noin), and norebalance (ceph osd set norebalance). Like your situation, we run smaller clusters; our largest cluster only has 18 OSDs. That keeps the cluster from starting data moves until all new drives are in place. Don't

[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-06-02 Thread DHilsbos
Only if you also look at why containers are bad in general, as that also applies to ceph as well. Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Fox, Kevin M

[ceph-users] Re: Revisit Large OMAP Objects

2021-04-14 Thread DHilsbos
Casey; That makes sense, and I appreciate the explanation. If I were to shut down all uses of RGW, and wait for replication to catch up, would this then address most known issues with running this command in a multi-site environment? Can I offline RADOSGW daemons as an added precaution?

[ceph-users] Re: Revisit Large OMAP Objects

2021-04-14 Thread DHilsbos
Konstantin; Dynamic resharding is disabled in multisite environments. I believe you mean radosgw-admin reshard stale-instances rm. Documentation suggests this shouldn't be run in a multisite environment. Does anyone know the reason for this? Is it, in fact, safe, even in a multisite
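
For reference, the commands under discussion are (list first, and heed the multisite caveat before running rm):

    radosgw-admin reshard stale-instances list
    radosgw-admin reshard stale-instances rm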

[ceph-users] Revisit Large OMAP Objects

2021-04-13 Thread DHilsbos
All; We run 2 Nautilus clusters, with RADOSGW replication (14.2.11 --> 14.2.16). Initially our bucket grew very quickly, as I was loading old data into it and we quickly ran into Large OMAP Object warnings. I have since done a couple manual reshards, which has fixed the warning on the primary

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread DHilsbos
Igor; Does this only impact CephFS then? Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Igor Fedotov [mailto:ifedo...@suse.de] Sent: Monday, April 12, 2021 9:16

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-12 Thread DHilsbos
Is there a way to check for these zombie blobs, and other issues needing repair, prior to the upgrade? That would allow us to know that issues might be coming, and perhaps address them before they result in corrupt OSDs. I'm considering upgrading our clusters from 14 to 15, and would really

[ceph-users] Re: First 6 nodes cluster with Octopus

2021-03-30 Thread DHilsbos
Mabi; We're running Nautilus, and I am not wholly convinced of the "everything in containers" view of the world, so take this with a small grain of salt... 1) We don't run Ubuntu, sorry. I suspect the documentation highlights 18.04 because it's the current LTS release. Personally, if I had

[ceph-users] Re: 10G stackabe lacp switches

2021-02-16 Thread DHilsbos
Sorry; Netgear M4300 switches, not M4100. Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: dhils...@performair.com [mailto:dhils...@performair.com] Sent: Monday, February 15,

[ceph-users] Re: 10G stackabe lacp switches

2021-02-15 Thread DHilsbos
MJ; I was looking at something similar, and reached out to one of my VARs, and they recommended Netgear M4100 series switches. We don't use any of them yet, so I can't provide first-hand experience. On the subject of UTP vs SFP+; I'm told that SFP+ with DAC cables experience lower latencies

[ceph-users] Re: NVMe and 2x Replica

2021-02-04 Thread DHilsbos
My impression is that cost / TB for a drive may be approaching parity, but the TB / drive is still well below (or at least at densities approaching parity, cost / TB is still quite high). I can get a Micron 15TB SSD for $2600, but why would I when I can get an 18TB Seagate IronWolf for <$600, a

[ceph-users] Re: NVMe and 2x Replica

2021-02-04 Thread DHilsbos
Adam; Earlier this week, another thread presented 3 white papers in support of running 2x on NVMe for Ceph. I searched each to find the section where 2x was discussed. What I found was interesting. First, there are really only 2 positions here: Micron's and Red Hat's. Supermicro copies

[ceph-users] Re: Worst thing that can happen if I have size= 2

2021-02-03 Thread DHilsbos
Adam; I'd like to see that / those white papers. I suspect what they're advocating is multiple OSD daemon processes per NVMe device. This is something which can improve performance. Though I've never done it, I believe you partition the device, and then create your OSD pointing at a

[ceph-users] Re: radosgw-admin sync status takes ages to print output

2021-01-14 Thread DHilsbos
Istvan; What version of Ceph are you running? Another email chain indicates you're running on CentOS 8, which suggests Octopus (15). We're running multisite replicated radosgw on Nautilus. I don't see the long running time that you are suggesting, though we only have ~35k objects. I

[ceph-users] Re: Global AVAIL vs Pool MAX AVAIL

2021-01-12 Thread DHilsbos
Mark; Just to clarify; when you say you have "1 replica," does that mean that Replica Size = 2, or Replica Size = 1? Neither of these is good. With Replica Size = 1; if one hard drive (which contains a PG) fails, the entire pool fails. Not just refuses writes, but stops accepting
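
A quick way to check what a pool is actually configured with (the pool name is a placeholder):

    ceph osd pool get mypool size
    ceph osd pool get mypool min_size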

[ceph-users] Re: Compression of data in existing cephfs EC pool

2021-01-04 Thread DHilsbos
Paul; I'm not familiar with rsync, but is it possible you're running into a system issue of the copies being shallow? In other words, is it possible that you're ending up with a hard-link (2 directory entries pointing to the same initial inode), instead of a deep copy? I believe CephFS is

[ceph-users] Nautilus Health Metrics

2020-12-28 Thread DHilsbos
All; I turned on device health metrics in one of our Nautilus clusters. Unfortunately, it doesn't seem to be collecting any information. When I do "ceph device get-health-metrics ", I get the following: { "20200821-223626": { "dev": "/dev/sdc", "error": "smartctl failed",
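
For context, collection is enabled and queried roughly like this (the device ID is a placeholder); the "smartctl failed" error may simply mean smartmontools is missing or unable to read that device on the host:

    ceph device monitoring on
    ceph device ls
    ceph device get-health-metrics <devid>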

[ceph-users] Re: CentOS

2020-12-08 Thread DHilsbos
Marc; As if that's not enough confusion (from the FAQ): "Security issues will be updated in CentOS Stream after they are solved in the current RHEL release. Obviously, embargoed security releases can not be publicly released until after the embargo is lifted." Thank you, Dominic L. Hilsbos,

[ceph-users] Re: CentOS

2020-12-08 Thread DHilsbos
Marc; I'm not happy about this, but RedHat is suggesting that those of us running CentOS for production should move to CentOS Stream. As such, I need to determine if the software I'm running on top of it can be run on Stream. Thank you, Dominic L. Hilsbos, MBA Director - Information

[ceph-users] CentOS

2020-12-08 Thread DHilsbos
All; As you may or may not know; this morning RedHat announced the end of CentOS as a rebuild distribution[1]. "CentOS" will be retired in favor of the recently announced "CentOS Stream." Can Ceph be installed on CentOS Stream? Since CentOS Stream is currently at 8, the question really is:

[ceph-users] Re: Ceph on ARM ?

2020-11-24 Thread DHilsbos
Adrian; I've always considered the advantage of ARM to be the reduction in the failure domain. Instead of one server with 2 processors, and 2 power supplies, in 1 case, running 48 disks, you can do 4 cases containing 8 power supplies, and 32 processors running 32 (or 64...) disks. The

[ceph-users] Re: Cephfs snapshots and previous version

2020-11-24 Thread DHilsbos
Oliver; You might consider asking this question of the CentOS folks. Possibly at cen...@centos.org. Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Oliver

[ceph-users] Re: Accessing Ceph Storage Data via Ceph Block Storage

2020-11-17 Thread DHilsbos
Vaughan; An absolute minimal Ceph cluster really needs to be 3 servers, and at that usable space should be 1/3 of raw space (see the archives of this mailing list for many discussions of why size=2 is bad). While it is possible to run other tasks on Ceph servers, memory utilization of Ceph

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Reduncancy, all PGs degraded, undersized, not scrubbed in time

2020-11-17 Thread DHilsbos
Phil; I'm probably going to get crucified for this, but I put a year of testing into this before determining it was sufficient to the needs of my organization... If the primary concerns are capability and cost (not top of the line performance), then I can tell you that we have had great

[ceph-users] Re: safest way to re-crush a pool

2020-11-10 Thread DHilsbos
Michael; I run a Nautilus cluster, but all I had to do was change the rule associated with the pool, and ceph moved the data. Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original
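
That single step (pool and rule names are placeholders) is:

    ceph osd pool set mypool crush_rule my_new_rule    # data migrates automatically once the rule changes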

[ceph-users] Re: Fix PGs states

2020-10-30 Thread DHilsbos
This line is telling: 1 osds down This is likely the cause of everything else. Why is one of your OSDs down? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International, Inc. dhils...@performair.com www.PerformAir.com -Original

[ceph-users] Re: Large map object found

2020-10-23 Thread DHilsbos
Peter; As with many things in Ceph, I don’t believe it’s a hard and fast rule (i.e. a non-power of 2 will work). I believe the issues are performance, and balance. I can't confirm that. Perhaps someone else on the list will add their thoughts. Has your warning gone away? Thank you,

[ceph-users] Re: Large map object found

2020-10-22 Thread DHilsbos
Peter; I believe shard counts should be powers of two. Also, resharding makes the buckets unavailable, but occurs very quickly. As such it is not done in the background, but in the foreground, for a manual reshard. Notice the statement: "reshard of bucket from to completed

[ceph-users] Re: Large map object found

2020-10-21 Thread DHilsbos
Peter; Look into bucket sharding. Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com From: Peter Eisch [mailto:peter.ei...@virginpulse.com] Sent: Wednesday, October 21, 2020 12:39 PM To:

[ceph-users] Re: Ceph iSCSI Performance

2020-10-06 Thread DHilsbos
Mark; Are you suggesting some other means to configure iSCSI targets with Ceph? If so, how do configure for non-tcmu? The iSCSI clients are not RBD aware, and I can't really make them RBD aware. Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International

[ceph-users] Ceph iSCSI Performance

2020-10-05 Thread DHilsbos
All; I've finally gotten around to setting up iSCSI gateways on my primary production cluster, and performance is terrible. We're talking 1/4 to 1/3 of our current solution. I see no evidence of network congestion on any involved network link. I see no evidence CPU or memory being a problem

[ceph-users] RadosGW and DNS Round-Robin

2020-09-04 Thread DHilsbos
All; We've been running RadosGW on our nautilus cluster for a while, and we're going to be adding iSCSI capabilities to our cluster, via 2 additional servers. I intend to also run RadosGW on these servers. That raises the question of how to "load balance" these servers. I don't believe that we

[ceph-users] Ceph iSCSI Questions

2020-09-04 Thread DHilsbos
All; We've used iSCSI to support virtualization for a while, and have used multi-pathing almost the entire time. Now, I'm looking to move from our single box iSCSI hosts to iSCSI on Ceph. We have 2 independent, non-routed, subnets assigned to iSCSI (let's call them 192.168.250.0/24 and

[ceph-users] Re: Cluster degraded after adding OSDs to increase capacity

2020-08-31 Thread DHilsbos
Dallas; First, I should point out that you have an issue with your units. Your cluster is reporting 81TiB (1024^4) of available space, not 81TB (1000^4). Similarly; it's reporting 22.8 TiB free space in the pool, not 22.8TB. For comparison; your 5.5 TB drives (this is the correct unit here)
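
The difference is roughly 10% at these scales, e.g. (approximate figures):

    81 TiB = 81 x 1024^4 bytes ≈ 89.1 TB
    5.5 TB = 5.5 x 1000^4 bytes ≈ 5.0 TiB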

[ceph-users] Re: Cluster degraded after adding OSDs to increase capacity

2020-08-28 Thread DHilsbos
Dallas; I would expect so, yes. I wouldn't be surprised to see the used percentage slowly drop as the recovery / rebalance progresses. I believe that the pool free space number is based on the free space of the most filled OSD under any of the PGs, so I expect the free space will go up as

[ceph-users] Re: Cluster degraded after adding OSDs to increase capacity

2020-08-27 Thread DHilsbos
Dallas; It looks to me like you will need to wait until data movement naturally resolves the near-full issue. So long as you continue to have this: io: recovery: 477 KiB/s, 330 keys/s, 29 objects/s the cluster is working. That said, there are some things you can do. 1) The near-full
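
One such adjustment is to temporarily relax the nearfull threshold while recovery proceeds (sketch only; the value is an example and should be reverted afterwards):

    ceph osd dump | grep ratio           # current full / backfillfull / nearfull ratios
    ceph osd set-nearfull-ratio 0.9      # example value; revert once recovery completes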

[ceph-users] Re: Help

2020-08-17 Thread DHilsbos
Randy; Nextcloud is easy, it has a "standard" S3 client capability, though it also has Swift client capability. As an S3 client, it does look for the older path style (host/bucket), rather than Amazon's newer DNS style (bucket.host). You can find information on configuring Nextcloud's primary

[ceph-users] Re: How to see files in buckets in radosgw object storage in ceph dashboard.?

2020-08-17 Thread DHilsbos
I would expect that most S3-compatible clients would work with RadosGW. As to adding it to the Ceph dashboard, I don't think that's a good idea. A bucket is a flat namespace. Amazon (and others since) added semantics that allow for a pseudo-hierarchical behavior, but it's still based

[ceph-users] Re: osd out vs crush reweight]

2020-07-21 Thread DHilsbos
Marcel; Yep, you're right. I focused in on the last op, and missed the ones above it. Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International, Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Marcel Kuiper

[ceph-users] Re: osd out vs crush reweight]

2020-07-21 Thread DHilsbos
Marcel; To answer your question, I don't see anything that would be keeping these PGs on the same node. Someone with more knowledge of how the Crush rules are applied, and the code around these operations, would need to weigh in. I am somewhat curious though; you define racks, and even rooms

[ceph-users] Re: osd out vs crush reweight]

2020-07-21 Thread DHilsbos
Marcel; Sorry, could you also send the output of: ceph osd tree Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International, Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: dhils...@performair.com

[ceph-users] Re: osd out vs crush reweight]

2020-07-21 Thread DHilsbos
Marcel; Thank you for the information. Could you send the output of: ceph osd crush rule dump Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International, Inc. dhils...@performair.com www.PerformAir.com -Original Message- From: Marcel Kuiper

[ceph-users] Re: osd out vs crush reweight

2020-07-21 Thread DHilsbos
Marcel; Short answer; yes, it might be expected behavior. PG placement is highly dependent on the cluster layout, and CRUSH rules. So... Some clarifying questions. What version of Ceph are you running? How many nodes do you have? How many pools do you have, and what are their failure domains?

[ceph-users] Thank you!

2020-07-20 Thread DHilsbos
I just want to thank the Ceph community, and the Ceph developers for such a wonderful product. We had a power outage on Saturday, and both Ceph clusters went offline, along with all of our other servers. Bringing Ceph back to full functionality was an absolute breeze, no problems, no hiccups,

[ceph-users] Re: ceph/rados performace sync vs async

2020-07-17 Thread DHilsbos
Daniel; As I said, I don't actually KNOW most of this. As such, what I laid out was conceptual. Ceph would need to be implemented to perform these operations in parallel, or not. Conceptually, those areas where operations can be parallelized, making them parallel would improve wall clock

[ceph-users] Re: ceph/rados performace sync vs async

2020-07-17 Thread DHilsbos
Daniel; How is your pool configured? Replica, or Erasure-Coded? I don't actually know any of this, but... I would expect that a synchronous call to a replica pool (R=3) would look something like this: Client --> PG Master Host (data) PG Master Host --> Local Disk (data) PG Master Host --> PG

[ceph-users] Re: about replica size

2020-07-10 Thread DHilsbos
This keeps coming up, which is not surprising, considering it is a core question. Here's how I look at it: The Ceph team has chosen to default to N+2 redundancy. This is analogous to RAID 6 (NOT RAID 1). The basic reasoning for N+2 in storage is as follows: If you experience downtime (either

[ceph-users] Re: Module 'cephadm' has failed: auth get failed: failed to find client.crash.ceph0-ote in keyring retval:

2020-07-03 Thread DHilsbos
Biohazard; This looks like a fairly simple authentication issue. It looks like the keyring(s) available to the command don't contain a key which meets the command's needs. Have you verified the presence and accuracy of your keys? Thank you, Dominic L. Hilsbos, MBA Director - Information

[ceph-users] Re: Object Gateway not working within the dashboard anymore after network change

2020-07-03 Thread DHilsbos
Hendrik; I'm assuming that s3.url.com round-robin DNSed to the new interface on each host. I don't see a problem with pointing the dashboard at one of the hosts directly. Though there is no load balancing in that kind of setup. I don't believe the dashboard represents a significant load. If

[ceph-users] Re: Object Gateway not working within the dashboard anymore after network change

2020-07-03 Thread DHilsbos
Hendrik; Since the hostname / FQDN for use by Ceph for your RGW server(s) changed, did you adjust the rgw-api-host setting for the dashboard? The command would be: ceph dashboard set-rgw-api-host Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International
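
With a placeholder hostname, that would be something like:

    ceph dashboard set-rgw-api-host rgw1.example.com
    ceph dashboard set-rgw-api-port 80      # if the port changed as well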

[ceph-users] Re: fault tolerant about erasure code pool

2020-06-26 Thread DHilsbos
As others have pointed out; setting the failure domain to OSD is dangerous because then all 6 chunks for an object can end up on the same host. 6 hosts really seems like the minimum to mess with EC pools. Adding a bucket type between host and osd seems like a good idea here, if you absolutely

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-24 Thread DHilsbos
All; This conversation has been fascinating. I'm throwing my hat in the ring, though I know almost nothing about systemd... Completely non-portable, but... Couldn't you write a script to issue the necessary commands to the desired drives, then create a systemd unit that calls it before OSD

[ceph-users] Re: Can't bind mon to v1 port in Octopus.

2020-06-18 Thread DHilsbos
My understanding is that MONs only configure themselves from the config file at first startup. After that all MONs use the monmap to learn about themselves, and their peers. As such; adding an address to the config file for a running MON, even if you restart / reboot, would not achieve the
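
The addresses the MONs are actually using can be checked against the monmap, for example:

    ceph mon dump      # lists each mon and the v1/v2 addresses it registered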

[ceph-users] radosgw-admin sync status output

2020-06-10 Thread DHilsbos
All; We've been running our Ceph clusters (Nautilus / 14.2.8) for a while now (roughly 9 months), and I've become curious about the output of the "radosgw-admin sync status" command. Here's the output from our secondary zone: realm () zonegroup () zone ()

[ceph-users] Re: OSD backups and recovery

2020-05-29 Thread DHilsbos
Jarett; It is and it isn't. Replication can be thought of as continuous backups. Backups, especially as SpiderFox is suggesting, are point-in-time, immutable copies of data. Until they are written over, they don't change, even if the data does. In Ceph's RadosGW (RGW) multi-site replication

[ceph-users] Re: OSD backups and recovery

2020-05-29 Thread DHilsbos
SpiderFox; If you're concerned about ransomware (and you should be), then you should: a) protect the cluster from the internet AND from USERS. b) place another technology between your cluster and your users (I use Nextcloud backed by RadosGW through S3 buckets) c) turn on versioning in your

[ceph-users] Re: General question CephFS or RBD

2020-05-29 Thread DHilsbos
Willi; ZFS on RBD seems like a waste, and overkill. A redundant storage solution on top of a redundant storage solution? You can have multiple file systems within CephFS, the thing to note is that each CephFS MUST have a SEPARATE active MDS. For failover, each should have a secondary MDS,
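
As a sketch on Nautilus (file system and pool names are placeholders; the pools must already exist), a second file system is created like this, and each file system then claims its own active MDS from the available standbys:

    ceph fs flag set enable_multiple true --yes-i-really-mean-it
    ceph fs new secondfs secondfs_metadata secondfs_data
    ceph fs status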

[ceph-users] Re: OSD backups and recovery

2020-05-29 Thread DHilsbos
Ludek; As a cluster system, Ceph isn't really intended to be backed up. It's designed to take quite a beating, and preserve your data. From a broader disaster recovery perspective, here's how I architected my clusters: Our primary cluster is laid out in such a way that an entire rack can fail

[ceph-users] Re: Ceph and iSCSI

2020-05-29 Thread DHilsbos
BR; I've built my own iSCSI targets (using Fedora and CentOS), and use them in production. I've also built 2 different Ceph clusters. They are completely different. Set aside everything you know about iSCSI, it doesn't apply. Ceph is a clustered object store, it can dynamically expand

[ceph-users] Re: CEPH failure domain - power considerations

2020-05-29 Thread DHilsbos
Phil; I like to refer to basic principles, and design assumptions / choices when considering things like this. I also like to refer to more broadly understood technologies. Finally; I'm still relatively new to Ceph, so here it goes... TLDR: Ceph is (likes to be) double-redundant (like

[ceph-users] Re: Maximum CephFS Filesystem Size

2020-04-01 Thread DHilsbos
All; Another interesting piece of information: the host that mounts the CephFS shows it as 45% full. Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International, Inc. dhils...@performair.com www.PerformAir.com -Original Message- From:

[ceph-users] Maximum CephFS Filesystem Size

2020-04-01 Thread DHilsbos
All; We set up a CephFS on a Nautilus (14.2.8) cluster in February, to hold backups. We finally have all the backups running, and are just waiting for the system reach steady-state. I'm concerned about usage numbers, in the Dashboard Capacity it shows the cluster as 37% used, while under

[ceph-users] Re: octopus upgrade stuck: Assertion `map->require_osd_release >= ceph_release_t::mimic' failed.

2020-03-26 Thread DHilsbos
This is a little beyond my understanding of Ceph, but let me take a crack at it... I've found that Ceph tends to be fairly logical, mostly. require_osd_release looks like a cluster wide configuration value which controls the minimum required version for an OSD daemon to join the cluster.

[ceph-users] Re: ceph ignoring cluster/public_network when initiating TCP connections

2020-03-23 Thread DHilsbos
Liviu; First: what version of Ceph are you running? Second: I don't see a cluster network option in your configuration file. At least for us, running Nautilus, there are no underscores (_) in the options, so our configuration files look like this: [global] auth cluster required = cephx
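
A minimal [global] section with both networks spelled out (addresses are placeholders) would look like:

    [global]
    auth cluster required = cephx
    auth service required = cephx
    auth client required = cephx
    public network = 10.0.1.0/24
    cluster network = 10.0.2.0/24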

[ceph-users] Re: Link to Nautilus upgrade

2020-03-09 Thread DHilsbos
Peter; Or possibly this: https://docs.ceph.com/docs/master/releases/nautilus/#upgrading-from-mimic-or-luminous Or this: https://docs.ceph.com/docs/master/releases/nautilus/#upgrading-from-pre-luminous-releases-like-jewel Thank you, Dominic L. Hilsbos, MBA Director – Information Technology

[ceph-users] Re: Link to Nautilus upgrade

2020-03-09 Thread DHilsbos
Peter; Might this be what you're after: https://docs.ceph.com/docs/nautilus/install/upgrading-ceph/# Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com From: Peter Eisch

[ceph-users] Re: Hardware feedback before purchasing for a PoC

2020-03-09 Thread DHilsbos
Ignacio; Personally, I like to use hardware for a proof of concept that I can roll over into the final system, or repurpose if the project is denied. As such, I would recommend these: Supermicro 5019A-12TN4 Barebones

[ceph-users] Re: MDS Issues

2020-03-06 Thread DHilsbos
All; When I went to check Wido's suggestion, I found the MDS daemons would start successfully. I obviously found no significant time differences. Sorry for making a mountain out of a mole-hill. Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International

[ceph-users] MDS Issues

2020-03-06 Thread DHilsbos
All; We are in the middle of upgrading our primary cluster from 14.2.5 to 14.2.8. Our cluster utilizes 6 MDSs for 3 CephFS file systems. 3 MDSs are collocated with MON/MGR, and 3 MDSs are collocated with OSDs. At this point we have upgraded all 3 of the MON/MDS/MGR servers. The MDS on 2 of

[ceph-users] Re: How can I fix "object unfound" error?

2020-03-05 Thread DHilsbos
Simone; What is your failure domain? If you don't know your failure domain can you provide the CRUSH ruleset for the pool that experienced the "object unfound" error? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com

[ceph-users] Re: Stately MDS Transitions

2020-02-28 Thread DHilsbos
Marc; If I understand that command correctly, it tells MDS 'c' to disappear, the same as rebooting would, right? Let me just clarify something then... When I run ceph fs dump I get the following: 110248: [v2:10.2.80.10:6800/1470324937,v1:10.2.80.10:6801/1470324937] 'S700041' mds.0.29

[ceph-users] Stately MDS Transitions

2020-02-28 Thread DHilsbos
All; We just started really fiddling with CephFS on our production cluster (Nautilus - 14.2.5 / 14.2.6), and I have a question... Is there a command / set of commands that transitions a standby-replay MDS server to the active role, while swapping the active MDS to standby-replay, or even just

[ceph-users] Re: SSD considerations for block.db and WAL

2020-02-27 Thread DHilsbos
Christian; What is your failure domain? If your failure domain is set to OSD / drive, and 2 OSDs share a DB / WAL device, and that DB / WAL device dies, then portions of the data could drop to read-only (or be lost...). Ceph is really set up to own the storage hardware directly. It doesn't

[ceph-users] Re: All pgs peering indefinetely

2020-02-04 Thread DHilsbos
Rodrigo; Best bet would be to check logs. Check the OSD logs on the affected server. Check cluster logs on the MONs. Check OSD logs on other servers. Your Ceph version(s) and your OS distribution and version would also be useful to help you troubleshoot this OSD flapping issue. Thank you,

[ceph-users] More OMAP Issues

2020-02-04 Thread DHilsbos
All; We're back to having large OMAP object warnings regarding our RGW index pool. This cluster is now in production, so I can't simply dump the buckets / pools and hope everything works out. I did some additional research on this issue, and it looks like I need to (re)shard the bucket
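
A hedged sketch of the checks that usually precede a manual reshard (the bucket name is a placeholder):

    radosgw-admin bucket limit check                 # shows fill_status per bucket against the shard limit
    radosgw-admin bucket stats --bucket=mybucket     # current num_shards and object count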

[ceph-users] Re: All pgs peering indefinetely

2020-02-04 Thread DHilsbos
Rodrigo; Are all your hosts using the same IP addresses as before the move? Is the new network structured the same? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message-

[ceph-users] Re: Write i/o in CephFS metadata pool

2020-01-29 Thread DHilsbos
Sammy; I had a thought; since you say the FS has high read activity, but you're seeing large write I/O... is it possible that this is related to atime (Linux last access time)? If I remember my Linux FS basics, atime is stored in the file entry for the file in the directory, and I believe

[ceph-users] No Activity?

2020-01-28 Thread DHilsbos
All; I haven't had a single email come in from the ceph-users list at ceph.io since 01/22/2020. Is there just that little traffic right now? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com

[ceph-users] 0B OSDs

2019-10-25 Thread DHilsbos
All; We're setting up our second cluster, using version 14.2.4, and we've run into a weird issue: all of our OSDs are created with a size of 0 B. Weights are appropriate for the size of the underlying drives, but ceph -s shows this: cluster: id: health: HEALTH_WARN

[ceph-users] Re: Manager plugins issues on new ceph-mgr nodes

2019-09-10 Thread DHilsbos
Alexander; What is your operating system? Is it possible that the dashboard module isn't installed? I've run into "Error ENOENT: all mgr daemons do not support module 'dashboard'" on my CentOS 7 machines, where the module is a separate package (I had to use "yum install ceph-mgr-dashboard" to

[ceph-users] [nautilus] Dashboard & RADOSGW

2019-09-10 Thread DHilsbos
All; We're trying to add a RADOSGW instance to our new production cluster, and it's not showing in the dashboard, or in ceph -s. The cluster is running 14.2.2, and the RADOSGW got 14.2.3. systemctl status ceph-radosgw@rgw.s700037 returns: active (running). ss -ntlp does NOT show port 80.

[ceph-users] Re: RBD, OpenStack Nova, libvirt, qemu-guest-agent, and FIFREEZE: is this working as intended?

2019-08-21 Thread DHilsbos
Florian; Forgive my lack of knowledge of OpenStack, and your environment / use case. Why would you need / want to snapshot an ephemeral disk? Isn't the point of ephemeral storage to not be persistent? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air