[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-23 Thread Xiubo Li
On 23/11/2022 19:49, Adrien Georget wrote: Hi, We upgraded a Pacific Ceph cluster to the latest Quincy version this morning. The cluster was healthy before the upgrade, everything was done according to the upgrade procedure (non-cephadm) [1], all services restarted correctly, but the

[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-23 Thread Xiubo Li
Hi Adrien, On 23/11/2022 19:49, Adrien Georget wrote: Hi, We upgraded a Pacific Ceph cluster to the latest Quincy version this morning. The cluster was healthy before the upgrade, everything was done according to the upgrade procedure (non-cephadm) [1], all services restarted correctly

[ceph-users] Re: radosgw-admin bucket check --fix returns a lot of errors (unable to find head object data)

2022-11-23 Thread Boris Behrens
Hi, I was able to clean up the objects by hand. I leave my breadcrumbs here in case someone finds them useful. 1. Get all rados objects via `radosgw-admin bucket radoslist --bucket $BUCKET` and filter the ones that you need to remove 2. Remove the rados objects via `rados -p $RGWDATAPOOL rm
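
A minimal sketch of that two-step cleanup, assuming $BUCKET and $RGWDATAPOOL hold your bucket name and RGW data pool; the grep pattern is only an example of how to filter the stray objects:

  $ BUCKET=mybucket
  $ RGWDATAPOOL=default.rgw.buckets.data
  # 1. list every rados object the bucket index references
  $ radosgw-admin bucket radoslist --bucket "$BUCKET" > radoslist.txt
  # 2. filter the objects to delete (pattern is illustrative only)
  $ grep '_multipart_' radoslist.txt > to_remove.txt
  # 3. remove them from the RGW data pool
  $ while read -r obj; do rados -p "$RGWDATAPOOL" rm "$obj"; done < to_remove.txt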

[ceph-users] Re: CephFS performance

2022-11-23 Thread Robert W. Eckert
Have you tested having the block.db and WAL for each OSD on a faster SSD/NVMe device/partition? I have a bit smaller environment, but was able to take a 2 TB SSD, split it into 4 partitions and use it for the db and WAL for the 4 drives. By default, if you move the block.db to a different
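
As a rough sketch of that layout (device names are made up), ceph-volume can place the block.db on the faster partition at OSD creation time; when only --block.db is given, the WAL is stored alongside the DB:

  # /dev/sdb is the spinning data device, /dev/nvme0n1p1 one of the four DB partitions
  $ ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1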

[ceph-users] Re: Issues during Nautilus Pacific upgrade

2022-11-23 Thread Marc
> We would like to share our experience upgrading one of our clusters from > Nautilus (14.2.22-1bionic) to Pacific (16.2.10-1bionic) a few weeks ago. > To start with, we had to convert our monitors' databases to RocksDB in Weirdly I have just one monitor db in leveldb still. Is it still recommend
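
A quick way to check which backend each monitor is on (assuming the default data path and a mon id equal to the short hostname) is the kv_backend file in its data directory:

  $ cat /var/lib/ceph/mon/ceph-$(hostname -s)/kv_backend
  rocksdb    # a leveldb-backed monitor prints "leveldb" here instead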

[ceph-users] Ceph Leadership Team Meeting 11-23-2022

2022-11-23 Thread Ernesto Puerta
Hi Cephers, Short meeting today: - The Sepia Lab is gradually coming back to life! Dan Mick & others managed to restore the testing environment (some issues could still remain, please ping Dan if you experience any). - Thanks to that, the release process for the Pacific 16.2.11

[ceph-users] Re: failure resharding radosgw bucket

2022-11-23 Thread Casey Bodley
hi Jan, On Wed, Nov 23, 2022 at 12:45 PM Jan Horstmann wrote: > > Hi list, > I am completely lost trying to reshard a radosgw bucket which fails > with the error: > > process_single_logshard: Error during resharding bucket > 68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs:(2) No such file or >

[ceph-users] Re: *****SPAM***** Re: CephFS performance

2022-11-23 Thread Marc
> crashes). In the case of BeeGFS, if there is a problem on any machine, the > whole cluster becomes inconsistent (at the point of my tests, I'm not working > with that). > But the first question you should ask yourself is, can you afford to be having these down hours, or do you want to have

[ceph-users] failure resharding radosgw bucket

2022-11-23 Thread Jan Horstmann
Hi list, I am completely lost trying to reshard a radosgw bucket which fails with the error: process_single_logshard: Error during resharding bucket 68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs:(2) No such file or directory But let me start from the beginning. We are running a ceph cluster
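
For reference, a manual reshard is normally queued and run along these lines (the shard count below is just a placeholder):

  $ radosgw-admin reshard add --bucket snapshotnfs --num-shards 128
  $ radosgw-admin reshard list
  $ radosgw-admin reshard process
  $ radosgw-admin reshard status --bucket snapshotnfs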

[ceph-users] Re: CephFS performance

2022-11-23 Thread quaglio
Hi Gregory, Thanks for your reply! We are evaluating possibilities to increase storage performance. I understand that Ceph has better capability in data resiliency. This has been one of the arguments I use to keep this tool in our storage. I say this mainly in failure

[ceph-users] Re: CephFS performance

2022-11-23 Thread quag...@bol.com.br
Hi David, First of all, thanks for your reply! The resiliency of BeeGFS comes from doing RAID on the disks (in hardware or software) on the same node as the storage. If there is a need for greater resilience, the maximum possible is through a buddy (which would be another storage machine acting as a fail

[ceph-users] Re: Multi site alternative

2022-11-23 Thread Matthew Leonard (BLOOMBERG/ 120 PARK)
Hey Ivan, I think the answer would be multisite. I know there is a lot of effort currently to work out the last few kinks. This tracker might be of interest as it sounds like an already identified issue, https://tracker.ceph.com/issues/57562#change-228263 Matt From: istvan.sz...@agoda.com

[ceph-users] Re: hw failure, osd lost, stale+active+clean, pool size 1, recreate lost pgs?

2022-11-23 Thread Clyso GmbH - Ceph Foundation Member
Hi Jelle, did you try: ceph osd force-create-pg https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-pg/#pool-size-1 Regards, Joachim ___ Clyso GmbH - Ceph Foundation Member Am 22.11.22 um 11:33 schrieb Jelle de Jong: Hello everybody,
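
For the record, force-create-pg takes a single PG id and recreates it empty, so any data in that PG is gone for good; a hedged sketch with a placeholder PG id (recent releases also ask for the confirmation flag):

  # list the stale PGs first
  $ ceph pg dump_stuck stale
  # recreate one of them as an empty PG (placeholder id 2.5)
  $ ceph osd force-create-pg 2.5 --yes-i-really-mean-it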

[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-23 Thread Adrien Georget
This bug looks very similar to this issue opened last year and closed without any solution: https://tracker.ceph.com/issues/52260 Adrien On 23/11/2022 at 12:49, Adrien Georget wrote: Hi, We upgraded a Pacific Ceph cluster to the latest Quincy version this morning. The cluster was healthy

[ceph-users] Multi site alternative

2022-11-23 Thread Szabo, Istvan (Agoda)
Hi, Due to the lack of documentation and issues with multisite bucket sync, I’m looking for an alternative solution where I can put some SLA around the sync, e.g. guarantee that a file will be available within x minutes. Which solution are you using that works fine with a huge amount of

[ceph-users] Issues during Nautilus Pacific upgrade

2022-11-23 Thread Ana Aviles
Hi, We would like to share our experience upgrading one of our clusters from Nautilus (14.2.22-1bionic) to Pacific (16.2.10-1bionic) a few weeks ago. To start with, we had to convert our monitors' databases to RocksDB in order to continue with the upgrade. Also, we had to migrate all our OSDs to

[ceph-users] Requesting recommendations for Ceph multi-cluster management

2022-11-23 Thread Thomas Eckert
I'm looking for guidance/recommendations on how to approach the topic below. As I'm fairly new to Ceph as a whole, I might be using or looking for terms/solutions incorrectly or simply missing some obvious puzzle pieces. Please do not assume advanced Ceph knowledge on my part (-: We are looking at

[ceph-users] Re: ceph-volume lvm zap destroyes up+in OSD

2022-11-23 Thread Eugen Block
I can confirm the same behavior for all-in-one OSDs: it starts to wipe, then aborts, but the OSD can't be restarted. I'll create a tracker issue, maybe not today though. Quoting Frank Schilder: Hi Eugen, can you confirm that the silent corruption happens also on a collocated OSD

[ceph-users] RGW Forcing buckets to be encrypted (SSE-S3) by default (via a global bucket encryption policy)?

2022-11-23 Thread Christian Rohmann
Hey ceph-users, loosely related to my question about client-side encryption in the Cloud Sync module (https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/I366AIAGWGXG3YQZXP6GDQT4ZX2Y6BXM/) I am wondering if there are other options to ensure data is encrypted at rest and also
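
For comparison, a per-bucket default (rather than a cluster-wide policy) can be set through the standard S3 PutBucketEncryption call; a sketch with the aws CLI against an RGW endpoint (endpoint URL and bucket name are placeholders):

  $ aws --endpoint-url https://rgw.example.com s3api put-bucket-encryption \
      --bucket mybucket \
      --server-side-encryption-configuration \
      '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'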

[ceph-users] filesystem became read only after Quincy upgrade

2022-11-23 Thread Adrien Georget
Hi, We upgraded a Pacific Ceph cluster to the latest Quincy version this morning. The cluster was healthy before the upgrade, everything was done according to the upgrade procedure (non-cephadm) [1], all services restarted correctly, but the filesystem switched to read-only mode when it
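
A read-only CephFS is usually the MDS protecting itself after a metadata write error, so a first round of checks (a generic sketch, not specific to this bug; <mds-name> is the active MDS daemon) would be:

  $ ceph health detail
  $ ceph fs status
  $ ceph tell mds.<mds-name> damage ls
  # then check the active MDS log for the error that triggered read-only mode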

[ceph-users] radosgw-admin bucket check --fix returns a lot of errors (unable to find head object data)

2022-11-23 Thread Boris Behrens
Hi, we have a customer with some _multipart_ files in their bucket, but the bucket has no unfinished multipart uploads. So I tried to remove them via $ radosgw-admin object rm --bucket BUCKET --object=_multipart_OBJECT.qjqyT8bXiWW5jdbxpVqHxXnLWOG3koUi.1 ERROR: object remove returned: (2) No
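
One way to double-check from the S3 side that there really are no open multipart uploads, and to see what the index check reports before running it with --fix (endpoint and bucket name are placeholders):

  $ aws --endpoint-url https://rgw.example.com s3api list-multipart-uploads --bucket BUCKET
  $ radosgw-admin bucket check --bucket=BUCKET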

[ceph-users] Re: ceph-volume lvm zap destroyes up+in OSD

2022-11-23 Thread Frank Schilder
Hi Eugen, can you confirm that the silent corruption happens also on a collocated OSD (everything on the same device) on Pacific? The zap command should simply exit with "osd not down+out" or at least not do anything. If this accidentally destructive behaviour is still present, I think it is
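
Until such a guard exists, the safe order of operations is to make sure the OSD really is down and out before zapping; a rough sketch for OSD id N (non-cephadm deployment assumed):

  $ ceph osd out osd.N
  $ systemctl stop ceph-osd@N
  $ ceph osd tree | grep 'osd.N '    # should now show the OSD as down
  $ ceph-volume lvm zap --osd-id N --destroy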

[ceph-users] Re: *****SPAM***** Re: CephFS performance

2022-11-23 Thread Marc
> > That said, if you've been happy using CephFS with hard drives and > gigabit ethernet, it will be much faster if you store the metadata on > SSD and can increase the size of the MDS cache in memory Is using multiple adapters already supported? That seems desirable when using 1 Gbit.

[ceph-users] Re: MDS internal op exportdir despite ephemeral pinning

2022-11-23 Thread Frank Schilder
Hi Patrick and everybody, I wrote a small script that pins the immediate children of 3 sub-dirs on our file system in a round-robin way to our 8 active ranks. I think the experience is worth reporting here. In any case, Patrick, if you can help me get distributed ephemeral pinning to work,
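
For context, manual export pinning of the kind that script does boils down to setting the ceph.dir.pin xattr on each child directory; a minimal sketch assuming /mnt/cephfs/homes is one of the three parent dirs and 8 active ranks (distributed ephemeral pinning would instead set ceph.dir.pin.distributed=1 on the parent):

  rank=0
  for d in /mnt/cephfs/homes/*/; do
      setfattr -n ceph.dir.pin -v "$rank" "$d"   # pin this subtree to MDS rank $rank
      rank=$(( (rank + 1) % 8 ))
  done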

[ceph-users] Re: ceph-volume lvm zap destroyes up+in OSD

2022-11-23 Thread Eugen Block
Hi, I can confirm the behavior for Pacific version 16.2.7. I checked with a Nautilus test cluster and there it seems to work as expected. I tried to zap a db device and then restarted one of the OSDs, successfully. So there seems to be a regression somewhere. I didn't search for tracker