[ceph-users] Re: Ovirt integration with Ceph

2023-04-25 Thread Konstantin Shalygin
Hi, can you check the logs in the vdsm.log file? What exactly happened on the storage domain connection? k Sent from my iPhone > On 26 Apr 2023, at 00:37, kushagra.gu...@hsc.com wrote: > > Hi Team, > > We are trying to integrate ceph with ovirt. > We have deployed ovirt 4.4. > We want to create a

[ceph-users] Re: Rados gateway data-pool replacement.

2023-04-25 Thread Richard Bade
Hi Gaël, I'm actually embarking on a similar project to migrate an EC pool from k=2,m=1 to k=4,m=2 using rgw multisite sync. I just thought I'd check, before you do a lot of work for nothing: when you say failure domain, do you mean the crush failure domain, not k and m? If it is failure

[ceph-users] Re: Could you please explain the PG concept

2023-04-25 Thread Anthony D'Atri
Absolutely. Moreover, PGs are not a unit of size, they are a logical grouping of smaller RADOS objects, because a few thousand PGs are a lot easier and less expensive to manage than tens or hundreds of millions of small underlying RADOS objects. They’re for efficiency, and are not any set

[ceph-users] Re: Could you please explain the PG concept

2023-04-25 Thread Alex Gorbachev
Hi Wodel, The simple explanation is that PGs are a level of storage abstraction above the drives (OSD) and below objects (pools). The links below may be helpful. PGs consume resources, so they should be planned as best you can. Now you can scale them up and down, and use autoscaler, so you
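For readers new to this, a minimal sketch of the autoscaler-related commands mentioned above (the pool name `mypool` is a placeholder):
```
# Show current PG counts and the autoscaler's recommendation per pool
ceph osd pool autoscale-status

# Let the autoscaler manage pg_num for a pool
ceph osd pool set mypool pg_autoscale_mode on

# Or keep explicit control and set pg_num manually
ceph osd pool set mypool pg_num 128
```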

[ceph-users] Could you please explain the PG concept

2023-04-25 Thread wodel youchi
Hi, I am learning Ceph and I am having a hard time understanding PGs and PG calculation. I know that a PG is a collection of objects, and that PGs are replicated over the hosts to respect the replication size, but... In traditional storage we use sizes in GB, TB and so on, we create a pool from a
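For reference, the traditional pre-autoscaler rule of thumb from the Ceph documentation, sketched here with made-up numbers:
```
# Total PGs across all pools ≈ (num_OSDs * 100) / replica_size,
# rounded to a power of two and split among pools by expected data share.
# Hypothetical example: 30 OSDs, replicated pools with size=3
echo $(( 30 * 100 / 3 ))   # 1000 -> round to 1024 PGs cluster-wide
```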

[ceph-users] Move ceph to new addresses and hostnames

2023-04-25 Thread Jan Marek
Hi there, I have a Ceph cluster created by ceph-volume (bluestore); every node has 12 HDDs and 1 NVMe, which is divided into 24 LVM partitions for DB and WAL. I've switched this cluster to 'ceph orch' management, then moved to the Quincy release (now I'm using version 17.2.5). I had to move whole

[ceph-users] Re: Bucket sync policy

2023-04-25 Thread vitaly . goot
On Quincy: the zone-group policy has the same problem with a similar setup. Way to reproduce: - Set up 3-way symmetrical replication on the zone-group level - Switch the zone-group to 'allowed' (zones are not in sync) - Add content to the detached zone(s) - Switch it back to 'enabled' (3-way
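For anyone trying to reproduce this, a hedged sketch of the zone-group policy toggling described above (the group id is a placeholder, and the flow/pipe definitions are omitted):
```
# Switch the zonegroup-level sync group between states
radosgw-admin sync group modify --group-id=group1 --status=allowed
radosgw-admin period update --commit
# ... write objects to the detached zone(s) ...
radosgw-admin sync group modify --group-id=group1 --status=enabled
radosgw-admin period update --commit
radosgw-admin sync status
```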

[ceph-users] Re: pacific 16.2.13 point release

2023-04-25 Thread Radoslaw Zarzynski
How about tracking the stuff in a single Etherpad? https://pad.ceph.com/p/prs-to-have-16.2.13 On Mon, Apr 24, 2023 at 9:01 PM Cory Snyder wrote: > > Hi Yuri, > > We were hoping that the following patch would make it in for 16.2.13 if > possible: > > https://github.com/ceph/ceph/pull/51200 > >

[ceph-users] Bucket sync policy

2023-04-25 Thread yjin77
Hello, ceph gurus, We are trying multisite sync policy feature with Quincy release and we encounter something strange, which we cannot solve even after combing through the internet for clues. Our test setup is very simple. I use mstart.sh to spin up 3 clusters, configure them with a single

[ceph-users] How to control omap capacity?

2023-04-25 Thread WeiGuo Ren
I have two OSDs. These OSDs are used for the RGW index pool. After a lot of stress tests, these two OSDs were written to 99.90% full. The full ratio (95%) did not take effect. I don't know much about this: could it be that if an OSD holding omap data fills up, it cannot be limited by the full ratio? Also, I use
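A few standard commands that may help when checking this:
```
# Show the configured nearfull/backfillfull/full ratios
ceph osd dump | grep ratio

# Per-OSD utilization, including the OMAP and META columns for index OSDs
ceph osd df

# The full ratio itself is adjustable (95% is the default)
ceph osd set-full-ratio 0.95
```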

[ceph-users] Deep-scrub much slower than HDD speed

2023-04-25 Thread Niklas Hambüchen
I observed that on an otherwise idle cluster, scrubbing cannot fully utilise the speed of my HDDs. `iostat` shows only 8-10 MB/s per disk, instead of the ~100 MB/s most HDDs can easily deliver. Changing scrubbing settings does not help (see below). Environment: * 6
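For context, these are the settings usually involved when tuning scrub throughput; the values shown are illustrative, not recommendations, and the poster reports that changing them did not help:
```
# Number of concurrent scrub operations an OSD will perform
ceph config set osd osd_max_scrubs 2
# Sleep injected between scrub chunks; lowering it speeds up scrubbing
ceph config set osd osd_scrub_sleep 0
# Read size per deep-scrub chunk
ceph config get osd osd_deep_scrub_stride
```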

[ceph-users] Re: How to replace an HDD in a OSD with shared SSD for DB/WAL

2023-04-25 Thread enochlew
Thank you for your suggestion! I have already deleted the LVM volumes of both the block and DB devices. I monitored the creation process of OSD.23 with the command "podman ps -a". osd.23 appeared for a short time, then was deleted. The feedback of the command-line

[ceph-users] Increase timeout for marking osd down

2023-04-25 Thread Nicola Mori
Dear Ceph users, my cluster is made of very old machines on a Gbit ethernet. I see that sometimes some OSDs are marked down due to slow networking, especially under heavy network load like during recovery. This causes problems, for example PGs keep being deactivated and activated as the OSDs
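A minimal sketch of the options that control how quickly OSDs are reported down (the values are examples only):
```
# Seconds without a heartbeat before peers report an OSD down (default 20)
ceph config set osd osd_heartbeat_grace 60
ceph config set mon osd_heartbeat_grace 60
# Require more peers to report an OSD before the monitors mark it down
ceph config set mon mon_osd_min_down_reporters 3
```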

[ceph-users] advise on adding RGW and NFS/iSCSI on proxmox

2023-04-25 Thread MartijnF
Hello, I'm maintaining 3 small infrastructures in NL: 7 primary, 4 secondary and 3 testing Proxmox nodes (hyperconverged). Till now we only use the embedded RBD facilities of Proxmox with Ceph 17.2.5. To facilitate S3/Swift and persistent storage for our internal k8s cluster I am looking

[ceph-users] [sync policy] multisite bucket full sync

2023-04-25 Thread yjin77
Hi folks, We are trying multisite sync policy feature with Quincy release and we encounter something strange. Maybe our understanding of sync policy is incorrect. I hope the community could help us uncover the mystery. Our test setup is very simple. I use mstart.sh to spin up 3 clusters,

[ceph-users] Massive OMAP remediation

2023-04-25 Thread Ben . Zieglmeier
Hi All, We have a RGW cluster running Luminous (12.2.11) that has one object with an extremely large OMAP database in the index pool. Listomapkeys on the object returned 390 Million keys to start. Through bilog trim commands, we’ve whittled that down to about 360 Million. This is a bucket
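For readers following along, a sketch of the kind of commands involved (bucket, pool and object names are placeholders; on Luminous the exact options may differ):
```
# Trim the bucket index log; each pass trims a chunk, so it is run repeatedly
radosgw-admin bilog trim --bucket=<bucket>
# Count the omap keys remaining on one index shard object
rados -p default.rgw.buckets.index listomapkeys <index-object> | wc -l
```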

[ceph-users] How to replace an HDD in a OSD with shared SSD for DB/WAL

2023-04-25 Thread enochlew
Hi, I built a Ceph cluster with cephadm. Every Ceph node has 4 OSDs. These 4 OSDs were built with 4 HDDs (block) and 1 SSD (DB). At present, one HDD is broken, and I am trying to replace the HDD and rebuild the OSD with the new HDD and the free space of the SSD. I did the following: #ceph osd stop
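A hedged sketch of the cephadm-style replacement flow (host name and device paths are placeholders; the data_devices/db_devices syntax requires a reasonably recent cephadm):
```
# Remove the OSD but keep its id reserved for the replacement
ceph orch osd rm 23 --replace

# Zap the new HDD and the freed DB LV on the host
ceph orch device zap <host> /dev/sdX --force

# Re-create the OSD on the new HDD with its DB on the shared SSD
ceph orch daemon add osd <host>:data_devices=/dev/sdX,db_devices=/dev/sdY
```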

[ceph-users] I am unable to execute 'rbd map xxx' as it returns the error 'rbd: map failed: (5) Input/output error'.

2023-04-25 Thread siriusa51
$ rbd map xxx rbd: sysfs write failed 2023-04-21 11:29:13.786418 7fca1bfff700 -1 librbd::image::OpenRequest: failed to retrieve image id: (5) Input/output error 2023-04-21 11:29:13.786456 7fca1b7fe700 -1 librbd::ImageState: 0x55a60108a040 failed to open image: (5) Input/output error rbd: error
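A few generic checks that are often useful with a failing 'rbd map' (the image name 'xxx' is kept from the post; whether any of these apply here is not certain):
```
# Confirm the image header is readable with the same credentials
rbd info xxx
# The kernel client logs the underlying reason for a failed map
dmesg | tail -n 50
# krbd does not support every image feature; unsupported ones can be disabled
rbd feature disable xxx object-map fast-diff deep-flatten
```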

[ceph-users] MDS recovery

2023-04-25 Thread jack
Hi All, We have a CephFS cluster running Octopus with three control nodes each running an MDS, Monitor, and Manager on Ubuntu 20.04. The OS drive on one of these nodes failed recently and we had to do a fresh install, but made the mistake of installing Ubuntu 22.04 where Octopus is not

[ceph-users] Ovirt integration with Ceph

2023-04-25 Thread kushagra . gupta
Hi Team, We are trying to integrate ceph with ovirt. We have deployed ovirt 4.4. We want to create a storage domain of POSIX compliant type for mounting a ceph based infrastructure in ovirt. We have done SRV based resolution in our DNS server for ceph mon nodes but we are unable to create a

[ceph-users] Re: Consequence of maintaining hundreds of clones of a single RBD image snapshot

2023-04-25 Thread Perspecti Vus
Hi again, Is there a limit/best-practice regarding number of clones? I'd like to start development, but want to make sure I won't run into scaling issues. Perspectivus ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to

[ceph-users] How to find the bucket name from Radosgw log?

2023-04-25 Thread viplanghe6
I found a log like this, and I thought the bucket name should be "photos": [2023-04-19 15:48:47.0.5541s] "GET /photos/shares/ But I cannot find it: radosgw-admin bucket stats --bucket photos failure: 2023-04-19 15:48:53.969 7f69dce49a80 0 could not get bucket info for bucket=photos
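Two commands that can help locate the bucket, for example when it belongs to a tenant (the tenant name below is hypothetical):
```
# List every bucket entrypoint RGW knows about, in tenant/bucket form
radosgw-admin metadata list bucket
# A tenanted bucket needs the tenant prefix in the stats call
radosgw-admin bucket stats --bucket=mytenant/photos
```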

[ceph-users] Cannot add disks back after their OSDs were drained and removed from a cluster

2023-04-25 Thread stillsmil
I find that I cannot re-add a disk to a Ceph cluster after the OSD on the disk is removed. Ceph seems to know about the existence of these disks, but not about their "host:dev" information: ``` # ceph device ls DEVICE HOST:DEV DAEMONS WEAR LIFE
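A minimal sketch of what usually makes such disks available again (host and device path are placeholders):
```
# Wipe leftover LVM/partition state so the disk is seen as available
ceph orch device zap <host> <dev-path> --force
# Force a re-scan and check the inventory again
ceph orch device ls --refresh
```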

[ceph-users] Dead node (watcher) won't timeout on RBD

2023-04-25 Thread max
Hey all, I recently had a k8s node failure in my homelab, and even though I powered it off (and it's done for, so it won't get back up), it still shows up as watcher in rbd status. ``` root@node0:~# rbd status kubernetes/csi-vol-3e7af8ae-ceb6-4c94-8435-2f8dc29b313b Watchers:
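A sketch of the usual way to get rid of a watcher from a dead client (the client address is a placeholder; on releases before Pacific the subcommand is 'blacklist'):
```
# The watcher list in "rbd status" shows the stale client address
rbd status kubernetes/csi-vol-3e7af8ae-ceb6-4c94-8435-2f8dc29b313b
# Blocklist that address so the watch is dropped instead of waiting for a timeout
ceph osd blocklist add <client-addr>
# The entry can be removed again later
ceph osd blocklist rm <client-addr>
```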

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread Marc
Maybe he is limited by the supported OS > > I would create a new cluster with Quincy and would migrate the data from > the old to the new cluster bucket by bucket. Nautilus is out of support > and > I would recommend at least to use a ceph version that is receiving > Backports. > >

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread Joachim Kraftmayer
I would create a new cluster with Quincy and would migrate the data from the old to the new cluster bucket by bucket. Nautilus is out of support and I would recommend at least using a Ceph version that is receiving backports. huxia...@horebdata.cn wrote on Tue, 25 Apr 2023, 18:30: > Dear

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread huxia...@horebdata.cn
Thanks a lot for the valuable input, Wesley, Josh, and Anthony. It seems the best practice would be to upgrade first, then expand, and remove the old nodes afterwards. best regards, Samuel huxia...@horebdata.cn From: Wesley Dillingham Date: 2023-04-25 19:55 To: huxia...@horebdata.cn CC:

[ceph-users] Lua scripting in the rados gateway

2023-04-25 Thread Thomas Bennett
Hi ceph users, I've been trying out the lua scripting for the rados gateway (thanks Yuval). As mentioned in my previous email, there is an error when trying to load the luasocket module. However, I thought it was a good time to report on my progress. My 'hello world' example below is

[ceph-users] PVE CEPH OSD heartbeat slow

2023-04-25 Thread Peter
Dear all, we are experiencing issues with Ceph after deploying it via PVE, with the network backed by a 10G Cisco switch with the vPC feature on. We are encountering slow OSD heartbeats and have not been able to identify any network traffic issues. Upon checking, we found that the ping is around 0.1 ms,
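One thing that can help narrow this down is the heartbeat statistics kept by each OSD; a sketch (the OSD id is an example):
```
# Recent heartbeat ping times as seen by this OSD (threshold in ms, 0 = show all)
ceph daemon osd.0 dump_osd_network 0
# Slow heartbeat warnings also show up here
ceph health detail
```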

[ceph-users] Reset a bucket in a zone

2023-04-25 Thread Yixin Jin
Hi folks, within a zonegroup, once a bucket is created, its metadata is synced over to all zones. With a bucket-level sync policy, however, its content may or may not be synced over. To simplify the sync process, sometimes I'd like to pick the bucket in a zone as the absolute truth and sync its

[ceph-users] Rados gateway lua script-package error lib64

2023-04-25 Thread Thomas Bennett
Hi, I've noticed that when my lua script runs I get the following error on my radosgw container. It looks like the lib64 directory is not included in the path when looking for shared libraries. Copying the content of lib64 into the lib directory solves the issue on the running container. Here

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread Wesley Dillingham
Get onto Nautilus first (and perhaps even go to Pacific) before expansion, primarily because starting in Nautilus degraded data recovery is prioritized over remapped data recovery. As you phase out old hardware and phase in new hardware you will have a very large amount of backfill
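As an illustration, backfill can also be throttled during the transition; on Luminous this is done with injectargs, on Nautilus and later via the config database:
```
# Luminous-style runtime injection
ceph tell osd.\* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
# Nautilus and later
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
```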

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread Josh Baergen
Hi Samuel, While the second method would probably work fine in the happy path, if something goes wrong I think you'll be happier having a uniform release installed. In general, we've found the backfill experience to be better on Nautilus than Luminous, so my vote would be for the first method.

[ceph-users] For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread huxia...@horebdata.cn
Dear Ceph folks, I would like to listen to your advice on the following topic: We have a 6-node Ceph cluster (for RGW usage only) running on Luminous 12.2.12, and will now add 10 new nodes. Our plan is to phase out the old 6 nodes and run the RGW Ceph cluster with the new 10 nodes on Nautilus

[ceph-users] Bucket notification

2023-04-25 Thread Szabo, Istvan (Agoda)
Hi, I'm trying to set a Kafka endpoint for bucket object-create operation notifications, but the notification is not created at the Kafka endpoint. The settings seem to be fine because I can upload objects to the bucket when these settings are applied: NotificationConfiguration> bulknotif
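Two read-only commands that help confirm what RGW has actually registered for the topic (the topic name is a placeholder):
```
# List the notification topics known to this RGW zone
radosgw-admin topic list
# Inspect one topic, including its push-endpoint (kafka://...) settings
radosgw-admin topic get --topic=<topic-name>
```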

[ceph-users] Re: cephadm grafana per host certificate

2023-04-25 Thread Eugen Block
Seems like the per-host config was actually introduced in 16.2.11: https://github.com/ceph/ceph/pull/48103 So I'm gonna have to wait for 16.2.13. Sorry for the noise. Zitat von Eugen Block : I looked a bit deeper and compared to a similar customer cluster (16.2.11) where I had to reconfigure

[ceph-users] Re: cephadm grafana per host certificate

2023-04-25 Thread Eugen Block
I looked a bit deeper and compared to a similar customer cluster (16.2.11) where I had to reconfigure grafana after an upgrade anyway. There it seems to work as expected with the per-host certificate. I only added the host-specific certs and keys and see the graphs in the dashboard while

[ceph-users] Re: Veeam backups to radosgw seem to be very slow

2023-04-25 Thread Anthony D'Atri
>> >> >> We have a customer that tries to use veeam with our rgw objectstorage and >> it seems to be blazingly slow. >> What also seems to be strange, that veeam sometimes show "bucket does not >> exist" or "permission denied". >> I've tested parallel and everything seems to work fine from

[ceph-users] Re: Veeam backups to radosgw seem to be very slow

2023-04-25 Thread Ulrich Klein
Hi, I’ve tested that combination once last year. My experience was similar. It was dead-slow. But if I remember correctly my conclusion was that Veeam was sending very slowly lots of rather small objects without any parallelism. But apart from the cruel slowness I didn’t have problems of the

[ceph-users] Re: Veeam backups to radosgw seem to be very slow

2023-04-25 Thread Janne Johansson
Den tis 25 apr. 2023 kl 15:02 skrev Boris Behrens : > > We have a customer that tries to use veeam with our rgw objectstorage and > it seems to be blazingly slow. > What also seems to be strange, that veeam sometimes show "bucket does not > exist" or "permission denied". > I've tested parallel and

[ceph-users] Veeam backups to radosgw seem to be very slow

2023-04-25 Thread Boris Behrens
We have a customer that tries to use Veeam with our rgw object storage and it seems to be blazingly slow. What also seems strange is that Veeam sometimes shows "bucket does not exist" or "permission denied". I've tested in parallel and everything seems to work fine from the s3cmd/aws cli

[ceph-users] Re: Bucket sync policy

2023-04-25 Thread Soumya Koduri
Hi Yixin, On 4/25/23 00:21, Yixin Jin wrote: Actually, "bucket sync run" somehow made it worse since now the destination zone shows "bucket is caught up with source" from "bucket sync status" even though it clearly missed an object. On Monday, April 24, 2023 at 02:37:46 p.m. EDT,

[ceph-users] Re: ceph pg stuck - missing on 1 osd how to proceed

2023-04-25 Thread xadhoom76
Hi, the system is still backfilling and still has the same PG degraded. I see that the percentage of degraded objects is static; it has never decreased below 0.010% for days. Is the backfilling connected to the degraded objects? Must the system finish backfilling before finishing the degraded one? [WRN]
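A few commands that help answer this kind of question (the PG id is a placeholder):
```
# Which PGs are degraded / backfilling right now
ceph pg dump pgs_brief | grep -E 'degraded|backfill'
# Detailed peering and recovery state of a single PG
ceph pg 2.1a query
# Health detail lists the PGs behind each warning
ceph health detail
```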