[ceph-users] Connecting to multiple filesystems from kubernetes

2022-05-23 Thread Sigurd Kristian Brinch
Hi,

I'm running a Pacific Ceph cluster and would like to connect to multiple CephFS 
filesystems from Kubernetes.
As far as I can tell, there is no field for this in the k8s API 
(https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#cephfsvolumesource-v1-core) 
to specify which filesystem to use.
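
Outside Kubernetes the filesystem can be selected with the fs= mount option 
(mds_namespace= on older kernels), e.g. (monitor address, credentials and 
filesystem name below are only placeholders):

  # mount a specific CephFS filesystem by name
  mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs2 \
    -o name=k8s,secretfile=/etc/ceph/k8s.secret,fs=cephfs2

As far as I know the ceph-csi driver exposes this as an fsName StorageClass 
parameter, but the in-tree CephFSVolumeSource above has no equivalent field.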

How do I specify which filesystem to connect to? 

BR
Sigurd


[ceph-users] Re: rbd mirroring - journal growing and snapshot high io load

2022-05-23 Thread ronny.lippold

hi arthur,

Just for information: we had some horrible days ...

Last week we shut down some virtual machines.
Most of them did not come back: timeout on the QMP socket ... and no KVM 
console.


So we switched to our rbd-mirror cluster and ... yes, it was working, phew.

Some days later we tried to install a devel Proxmox package, which was 
supposed to help.
It did not ... what helped was to rbd move the image and then move it back 
(like a rename).


Today I found the answer.

I cleaned up the pool config and we removed the journaling feature from 
the images.
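For reference, roughly these commands (pool/image names here are just 
placeholders):

  # disable the journaling feature on an image
  rbd feature disable vm-pool/vm-100-disk-0 journaling
  # check which features are left on the image
  rbd info vm-pool/vm-100-disk-0 | grep features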

After that, everything was booting fine.

Maybe the performance issue with snapshots came from a Proxmox bug ... 
we will see.

(https://forum.proxmox.com/threads/possible-bug-after-upgrading-to-7-2-vm-freeze-if-backing-up-large-disks.109272/)

have a great time ...

ronny

On 2022-05-12 15:29, Arthur Outhenin-Chalandre wrote:

On 5/12/22 14:31, ronny.lippold wrote:

Many thanks, we will check the slides ... they are looking great.




OK, you mean that the journal growth came about because replication is too 
slow?

Strange ... I thought our cluster was not that big ... but OK.
So we cannot use journaling ...
Maybe someone else has the same result?


If you want a bit more details on this you can check my slides here:
https://codimd.web.cern.ch/p/-qWD2Y0S9#/.


Hmmm, I think there are some plans to have a way to spread the snapshots 
over the provided interval in Reef (and not take every snapshot at once), 
but that's unfortunately not here today... The timing thing is a bit 
weird, but I am not an expert on the implications of RBD snapshots in 
general...

Maybe you can try to reproduce it by taking a snapshot by hand with `rbd 
mirror image snapshot` on some of your images. Maybe that's something 
related to really big images? Or that there were a lot of writes since the 
last snapshot?
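
Something along these lines, with pool/image as placeholders:

  # take a mirror snapshot of one image by hand and time it
  time rbd mirror image snapshot mypool/big-image
  # then check the mirroring status of that image
  rbd mirror image status mypool/big-image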



Yes, right, I was also thinking of this ...
I would like to find something to debug the problem.
Problems after 50 days ... I do not understand this.

Which way are you actually going? Do you have replication?


We are going towards mirror snapshots, but we haven't advertised it 
internally so far and we won't enable it on every image; it would only be 
for new volumes whose owners explicitly want that feature. So we are 
probably not going to hit these performance issues that you are suffering 
from for quite some time, and the scope of it should be limited...
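
For a single volume that is roughly (pool/image names are placeholders):

  # enable snapshot-based mirroring on one image only
  rbd mirror image enable mypool/new-volume snapshot
  # optionally add a per-image snapshot schedule
  rbd mirror snapshot schedule add --pool mypool --image new-volume 3h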



[ceph-users] Re: 3-node Ceph with DAS storage and multipath

2022-05-23 Thread Kostadin Bukov
Hello Frank,
Thanks for the feedback!

> Ceph doesn't, but LVM does. This is one of the reasons for moving to LVM, to 
> allow transparent use of multi-path devices via LVM instead of replicating 
> native support in ceph. Now, the remaining question is, how to tell ceph-adm 
> to use only the multipath devices created by the multipath service, if you 
> want to use ceph-adm (I don't). I don't remember where they show up, is it 
> /dev/multipath or something?
>

For shared storage we use multipathd (the multipath daemon) and combine two 
raw devices, for example /dev/sda (block device 1 visible via path A) and 
/dev/sdb (block device 1 visible via path B), into a multipath device named, 
for example, diskmpath1. Once the multipath configuration is in place, the 
device '/dev/mapper/diskmpath1' appears, and on top of this 
'/dev/mapper/diskmpath1' we create a physical volume, for example:

[root@comp1 ~]# pvcreate /dev/mapper/diskmpath1 
  Physical volume "/dev/mapper/diskmpath1" successfully created
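
We can then check that both paths ended up behind the one device, e.g.:

  # show the multipath topology for the device (alias from the example above)
  multipath -ll diskmpath1
  # the mapper device that LVM (and potentially Ceph) would use
  ls -l /dev/mapper/diskmpath1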

So I'm trying to find out if this multipath setup is also correct to use for a 
Ceph deployment (see the sketch below for what I have in mind).
Once I figure out how to proceed with multipath devices, my plan is to install 
Ceph using the official install method:
- Cephadm installs and manages a Ceph cluster using containers and systemd, 
with tight integration with the CLI and dashboard GUI.
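
As a rough, untested sketch (host and device names taken from the example 
above), cephadm can be pointed at an explicit device path:

  # create an OSD on the multipath mapper device on host comp1
  ceph orch daemon add osd comp1:/dev/mapper/diskmpath1

or the device paths could be listed explicitly in an OSD service spec instead 
of letting it consume all available devices.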

Regards,
Kosta 

On Monday, 23 May 2022 12:31:15 pm (+03:00), Frank Schilder wrote:

> > > My question is shall I configure multipath from RHEL 8.6 OS in advance 
> > > (for
> > > example sda+sdbb=md0) or I should leave cephadm to handle the multipath by
> > > itself?
> >
> > I don’t think Ceph has any concept of multipathing. Since Ceph keeps 
> > multiple copies of data, it’s usually overkill.
>
> Ceph doesn't, but LVM does. This is one of the reasons for moving to LVM, to 
> allow transparent use of multi-path devices via LVM instead of replicating 
> native support in ceph. Now, the remaining question is, how to tell ceph-adm 
> to use only the multipath devices created by the multipath service, if you 
> want to use ceph-adm (I don't). I don't remember where they show up, is it 
> /dev/multipath or something?
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>



[ceph-users] orphaned journal_data objects on pool after disabling rbd mirror

2022-05-23 Thread Denis Polom

Hi

after disabling the journaling feature on the images and disabling rbd-mirror 
on the pool, there are still a lot of journal_data objects in the pool.
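
For reference, they show up like this (pool name is just an example):

  # list and count the leftover journal objects in the pool
  rados -p rbd ls | grep '^journal_data' | head
  rados -p rbd ls | grep -c '^journal_data'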


Is it safe to remove these objects manually from the pool?

Thanks



[ceph-users] Re: Dashboard: SSL error in the Object gateway menu only

2022-05-23 Thread E Taka
Thanks, Eugen, for pointing that out. The commands for setting some RGW
options disappeared in 16.2.6.
I'm pretty sure that none of the admins used the IP address instead of a
DNS name. We'll try it with the orch commands.
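
I.e. something along these lines (whether the host has to be removed and 
re-added first is what we still have to find out; the hostname is a 
placeholder):

  # check how the hosts are currently registered with the orchestrator
  ceph orch host ls
  # re-add a host with its DNS name rather than the bare IP
  ceph orch host add rgw1.example.com 10.149.12.179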

On Sun, 22 May 2022 at 10:52, Eugen Block wrote:

> Hi,
> in earlier versions (e.g. Nautilus) there was a dashboard command to
> set the RGW hostname, that is not available in Octopus (I didn’t check
> Pacific, probably when cephadm took over), so I would assume that it
> comes from the ‘ceph orch host add’ command and you probably used the
> host’s IP for that? But I’m just guessing here. Not sure if there’s a
> way to reset it with a ceph command or if removing and adding the host
> again would be necessary. There was a thread about that just this week.
>
>
> Quoting E Taka <0eta...@gmail.com>:
>
> > Version: Pacific 16.2.9
> >
> > Hi,
> >
> > when clicking in the Dashboard at "Object Gateway" submenus, for example
> > "Daemons", the Dashboard gets an HTTP error 500. The logs says about
> this:
> >
> > requests.exceptions.SSLError: HTTPSConnectionPool(host='10.149.12.179',
> > port=8000): Max retries exceeded with url: /admin/metadata/user?myself
> > (Caused by SSLError(CertificateError("hostname '10.149.12.179' doesn't
> > match either of […hostnames…] […]
> >
> > We applied a correct rgw_frontend_ssl_certificate with a FQDN.
> > Obviously the error shows that the Dashboard should use the FQDN instead
> of
> > the correct IP address '10.149.12.179'. But how can I change it?
> >
> > (Yes, there is the workaround  "ceph dashboard set-rgw-api-ssl-verify
> > False", which I try to avoid).
> >
> > Thanks
> > Erich


[ceph-users] Re: 3-node Ceph with DAS storage and multipath

2022-05-23 Thread Kostadin Bukov
Hello,


@Anthony


> This helps. So this is a unique type of blade / CI chassis. You describe it 
> as a PoC: would you use similar hardware for production? That Chassis/Frame 
> could have a large blast radius. One of the great things about Ceph of course 
> is that it’s adaptable to a wide variety of hardware, but there are some 
> caveats:
>


Correct, this is the next generation of blade chassis provided by HP, called 
Synergy. It is the successor of the old blade chassis (C7000) and blade 
servers, which are already discontinued. Yes, our intention is to use a 
similar (if not the same) setup for production platforms. This is why we are 
trying to configure Ceph 'the proper way' and according to best practices.


> * When using dense hardware that packs a lot into a single chassis, consider 
> what happens when that chassis smokes, is down for maintenance, etc.
> * Large / deep chassis can pull a lot of power, so evaluate your production 
> configuration and the KW available to your racks / PDUs. It is not uncommon 
> for racks with large / dense chassis to only be half filled because of 
> available power limitations, or the weight capacity of a raised floor. I’ve 
> even seen DCs with strict policies that all racks must have front and rear 
> doors, and sometimes deep chassis prevent doors from closing unless the racks 
> are extra deep themselves.
>


You are absolutely right! The Synergy frame (chassis) has redundant 
components (power supplies, fans, interconnect modules, uplinks, etc.). So 
when there is some maintenance on the frame (like a firmware update, for 
example), it is done one component at a time to avoid outages. We take into 
account the power (kW) needed for the frame and install the needed PDUs (fed 
from power sources A and B for redundancy). The same goes for the rack's 
static and dynamic load. Our goal is to have one Synergy frame (10U), and in 
rare cases at most two Synergy frames, in one rack cabinet.

> Something like a DL360 or DL380 is common for Ceph. I’m happy to discuss 
> infrastructure off-list if you like.
>


We are using HP DL380 G10 servers as well for our bare-metal and VM infra 
platforms. One of the advantages of the Synergy frame is that you have an 
all-in-one 10U box (servers, DAS storage, network modules, power supplies, 
etc.), and this saves space in the rack cabinet compared with the 
rack-mountable DL380 G10 servers. The other advantage is that you can expand 
it pretty easily without laying power cables, network cables, etc.; you just 
plug in the next Synergy compute module. This is why we are focused on the 
HPE Synergy frame currently.


> * Don’t spend extra for multipathing, and disable it on your PoC if you can 
> for simplicity


All the SSD drives in the DAS storage are accessible via two modules, I/O 
adapter 1 and I/O adapter 2. The easiest thing for me would be to pull out 
one of the I/O adapters of the DAS D3940 storage and leave only one path, or 
to remove one of the SAS interconnect modules. My concern is that a single 
I/O adapter inside the DAS would be a single point of failure, which means 
that if this I/O module fails for some reason, all compute modules in the 
frame lose connectivity to the SSD drives in the DAS, causing a full outage. 
So this is why we are trying to set up a highly available and redundant 
configuration without a single point of failure.

> * Consider larger SSDs, 3.84TB or 7.6TB, but in a production deployment you 
> also need to consider the discrete number of drives for fault tolerance, 
> balancing, and software bottlenecks.


Thanks, I will check larger SSD drives. Usually we size the space according 
to the specific requirements, which can change from installation to 
installation.

> * Seriously consider NVMe instead of SAS for a new design. With a judicious 
> design you might be surprised at how competitive the cost can be. SAS market 
> share is progressively declining and the choice available / new drives will 
> continue to shrink
>


The reason we selected SAS is that the drives were cheaper compared with NVMe 
(HP prices can be pretty high sometimes), but I will check NVMe drive prices 
to compare them again.


> Also, “mixed use” probably means 3DWPD-class, for Ceph purposes I’ve 
> personally seen 1DWPD-class drives be plenty. ymmv depending on your use-case 
> and intended lifetime


'Mixed use' means that the SSD drive is suitable for both writing and 
reading, a balance between the two. Our applications are quite 
write-intensive, so our main goal is to have resilient SSDs instead of slow 
but reliable HDDs.






@Javier,


> Kosta, can you manage dual paths to disk through multipathd?
>


Yes, I can configure the DAS drives with multipath from the RHEL 8.6 OS.
I'm not doing this yet because I'm not sure if this is the 'proper' way for 
Ceph storage.


Regards,
Kosta  




On Saturday, 21 May 2022 8:45:18 pm (+03:00), Anthony D'Atri wrote:

>
> This helps. So this is a unique type of blade / CI chassis. You describe it 
> as a PoC: would you use similar hardware for