[ceph-users] Re: Question about erasure coding on cephfs

2024-03-03 Thread Patrick Begou

Hi Erich,

Regarding a similar problem I asked about some months ago, Frank Schilder
published the following on the list (December 6, 2023); it may be helpful for
your setup. I have not tested it yet, as my cluster is still in deployment.


   To provide some first-hand experience: I was operating a pool with a 6+2 EC
profile on 4 hosts for a while (until we got more hosts), and the "subdivide a
physical host into 2 crush buckets" approach actually worked best (I basically
tried all the approaches described in the linked post and they all had pitfalls).

   Procedure is more or less:

   - add a second (logical) host bucket for each physical host by suffixing the
host name with "-B" (ceph osd crush add-bucket HOSTNAME-B host)
   - move half of each host's OSDs to this new host bucket (ceph osd crush move
osd.ID host=HOSTNAME-B)
   - make this location persist across OSD reboots (ceph config set osd.ID
crush_location host=HOSTNAME-B)

   This will allow you to move OSDs back easily when you get more hosts and can
afford the recommended 1 shard per host. It will also show which OSDs were moved
and where with a simple "ceph config dump | grep crush_location". Best of all, you
don't have to fiddle around with crush maps and hope they do what you want. Just use
failure domain host and you are good. No more than 2 host buckets per physical host
means no more than 2 shards per physical host with default placement rules.

   I was operating this set-up with min_size=6 and feeling bad about it due
to the reduced maintainability (risk of data loss during maintenance). It's not
great really, but sometimes there is no way around it. I was happy when I got
the extra hosts.
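
Condensed into commands, the procedure above would look roughly like this for one
physical host and one of its OSDs (the host and OSD names are placeholders, and I
have not run this myself yet, so take it as a sketch only):

   # second logical host bucket for physical host store-01 (example name)
   ceph osd crush add-bucket store-01-B host
   ceph osd crush move store-01-B root=default           # place the new bucket in the crush tree
   # repeat for half of the host's OSDs, e.g. osd.7
   ceph osd crush move osd.7 host=store-01-B
   ceph config set osd.7 crush_location host=store-01-B  # keep the placement across restarts
   ceph config dump | grep crush_location                # check which OSDs were moved where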

Patrick

On 02/03/2024 at 16:37, Erich Weiler wrote:

Hi Y'all,

We have a new ceph cluster online that looks like this:

md-01 : monitor, manager, mds
md-02 : monitor, manager, mds
md-03 : monitor, manager
store-01 : twenty 30TB NVMe OSDs
store-02 : twenty 30TB NVMe OSDs

The cephfs storage is using erasure coding at 4:2.  The crush domain 
is set to "osd".


(I know that's not optimal but let me get to that in a minute)

We currently have a single regular NFS server (nfs-01) with the same storage
as the OSD servers above (twenty 30TB NVMe disks).  We want to wipe the NFS
server and integrate it into the above ceph cluster as "store-03".  That would
give us three OSD servers, and we would then switch the crush domain to "host".
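
(I assume the switch itself would amount to creating a new 4+2 EC profile and
crush rule with host as the failure domain and pointing the data pool at the new
rule; the profile, rule, and pool names below are just placeholders:)

   ceph osd erasure-code-profile set ec-4-2-host k=4 m=2 crush-failure-domain=host
   ceph osd crush rule create-erasure cephfs-ec-4-2-host ec-4-2-host
   ceph osd pool set cephfs_data_ec crush_rule cephfs-ec-4-2-host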


My question is this: given that we have 4:2 erasure coding, would the data
rebalance evenly across the three OSD servers after we add store-03, such that
if a single OSD server went down, the other two would be enough to keep the
system online?  In other words, with 4:2 erasure coding, would 2 shards go on
store-01, 2 shards on store-02, and 2 shards on store-03?  Is my understanding
correct?


Thanks for any insight!

-erich


[ceph-users] [Quincy] NFS ingress mode haproxy-protocol not recognized

2024-03-03 Thread wodel youchi
Hi;

I tried to create an NFS cluster using this command :
[root@controllera ceph]# ceph nfs cluster create mynfs "3 controllera
controllerb controllerc" --ingress --virtual_ip 20.1.0.201 --ingress-mode
haproxy-protocol
Invalid command: haproxy-protocol not in default|keepalive-only

And I got this error : Invalid command haproxy-protocol
I am using Quincy : ceph version 17.2.7 (...) quincy (stable)

Is it not supported yet?

Regards.


[ceph-users] Re: [Quincy] NFS ingress mode haproxy-protocol not recognized

2024-03-03 Thread Adam King
According to https://tracker.ceph.com/issues/58933, that was only backported
as far as Reef. If I remember correctly, the reason was that the ganesha
version we were including in our Quincy containers wasn't new enough to
support the feature on its end, so backporting the nfs/orchestration side of
it wouldn't have been useful.
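
For reference, on Quincy the only accepted ingress modes are default and
keepalive-only (as the error message shows), so a sketch of an invocation that
should be accepted there is the same command without the haproxy-protocol mode:

   # Quincy (17.2.x): drop --ingress-mode haproxy-protocol and use the default ingress mode
   ceph nfs cluster create mynfs "3 controllera controllerb controllerc" \
       --ingress --virtual_ip 20.1.0.201
   # on Reef or later, the original command with --ingress-mode haproxy-protocol should be accepted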

On Sun, Mar 3, 2024 at 8:25 AM wodel youchi  wrote:

> Hi;
>
> I tried to create an NFS cluster using this command :
> [root@controllera ceph]# ceph nfs cluster create mynfs "3 controllera
> controllerb controllerc" --ingress --virtual_ip 20.1.0.201 --ingress-mode
> haproxy-protocol
> Invalid command: haproxy-protocol not in default|keepalive-only
>
> And I got this error : Invalid command haproxy-protocol
> I am using Quincy : ceph version 17.2.7 (...) quincy (stable)
>
> Is it not supported yet?
>
> Regards.


[ceph-users] Re: Ceph orch doesn't execute commands and doesn't report correct status of daemons

2024-03-03 Thread Adam King
Okay, it seems like from what you're saying the RGW image itself isn't
special compared to the other ceph daemons; it's just that you want to use
the image from your local registry. In that case, I would still recommend
just using `ceph orch upgrade start --image <image>` with the image from
your local registry. It will transition all the ceph daemons, rather than
just RGW, to that image, but that's generally how cephadm expects things to
be anyway. Assuming that registry is reachable from all the nodes in the
cluster, the upgrade should be able to do it. Side note, but I have seen
some people in the past have issues with cephadm's use of repo digests when
using local registries, so if you're having issues you may want to try
setting the mgr/cephadm/use_repo_digest option to false. Just keep in mind
that means if you want to upgrade to another image, you'd have to make sure
it has a different name (usage of repo digests was added to support
floating tags).
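
A minimal sketch of that approach, using the registry address and image tag from
your quoted message below (substitute your own image; the digest setting is only
needed if you hit that issue):

   # point the whole cluster at the image in the local registry
   ceph orch upgrade start --image 192.168.2.36:4000/ceph/ceph:v17
   ceph orch upgrade status                               # watch progress
   # only if repo-digest resolution misbehaves with the local registry
   ceph config set mgr mgr/cephadm/use_repo_digest false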

On Fri, Mar 1, 2024 at 11:47 AM wodel youchi  wrote:

> Hi,
>
> I'll try the 'ceph mgr fail' and report back.
>
> In the meantime, my problem with the images...
> I am trying to use my local registry to deploy the different services. I
> don't know how to use the 'apply' and force my cluster to use my local
> registry.
> So basically, what I am doing so far is :
> 1 - ceph orch apply -i rgw-service.yml   <-- deploys the
> rgw, and this will pull the image from the internet
> 2 - ceph orch daemon redeploy rgw.opsrgw.controllera.gtrttj --image
> 192.168.2.36:4000/ceph/ceph:v17   <-- redeploys the daemons of
> that service with my local image.
>
> How may I deploy directly from my local registry?
>
> Regards.
>
>
>
>