[ceph-users] Re: Building new cluster had a couple of questions

2023-12-21 Thread Simon Ironside
On 21/12/2023 13:50, Drew Weaver wrote: Howdy, I am going to be replacing an old cluster pretty soon and I am looking for a few suggestions. #1 cephadm or ceph-ansible for management? #2 Since the whole... CentOS thing... what distro appears to be the most straightforward to use with Ceph?
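For reference on the cephadm option raised in the question, the flow boils down to bootstrapping one node and enrolling the rest. A minimal sketch, not from the thread; the IP addresses and hostname are placeholders:

    # Bootstrap the first monitor/manager on this host
    cephadm bootstrap --mon-ip 192.0.2.10
    # Enroll additional hosts once the bootstrap node is healthy
    ceph orch host add node2 192.0.2.11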

[ceph-users] Re: v18.2.1 Reef released

2023-12-20 Thread Simon Ironside
Hi All, We're deploying a fresh Reef cluster now and noticed that cephadm bootstrap deploys 18.2.0 and not 18.2.1. It appears this is because the v18, v18.2 (and v18.2.0) tags are all pointing to the v18.2.0-20231212 tag since 16th December here:
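One workaround, assuming the issue is only the floating v18/v18.2 tags, is to point bootstrap at the exact image. A sketch; the monitor IP is a placeholder and the registry path is the standard quay.io one:

    # Pin the bootstrap container to the 18.2.1 build rather than the v18/v18.2 tags
    cephadm bootstrap --image quay.io/ceph/ceph:v18.2.1 --mon-ip 192.0.2.10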

[ceph-users] Re: RBD Disk Usage

2023-08-07 Thread Simon Ironside
When you delete files they're not normally scrubbed from the disk, the file system just forgets the deleted files are there. To fully remove the data you need something like TRIM: fstrim -v /the_file_system Simon On 07/08/2023 15:15, mahnoosh shahidi wrote: Hi all, I have an rbd image that
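To see the effect on the Ceph side, a hedged example (pool and image names are placeholders; discard/unmap must reach the RBD layer for the space to actually be released):

    # Inside the guest or on the mounted file system: release deleted blocks
    fstrim -v /the_file_system
    # On a Ceph client: compare provisioned vs actually used space for the image
    rbd du rbd_pool/vm_image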

[ceph-users] Re: One PG keeps going inconsistent (stat mismatch)

2021-10-11 Thread Simon Ironside
the primary) that serve this PG to 0 to try to force its recreation. Thanks, Simon. On 22/09/2021 18:50, Simon Ironside wrote: Hi All, I have a recurring single PG that keeps going inconsistent. A scrub is enough to pick up the problem. The primary OSD log shows something like: 2021-09-22 18:08
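A hedged sketch of that approach, using the PG id from the original post below; the OSD id is a placeholder and would be taken from the acting set:

    # Show which OSDs currently serve the PG (up set and acting set)
    ceph pg map 1.3ff
    # Zero the crush weight of an acting OSD so the PG backfills elsewhere
    ceph osd crush reweight osd.12 0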

[ceph-users] One PG keeps going inconsistent (stat mismatch)

2021-09-22 Thread Simon Ironside
Hi All, I have a recurring single PG that keeps going inconsistent. A scrub is enough to pick up the problem. The primary OSD log shows something like: 2021-09-22 18:08:18.502 7f5bdcb11700 0 log_channel(cluster) log [DBG] : 1.3ff scrub starts 2021-09-22 18:08:18.880 7f5bdcb11700 -1
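For anyone hitting the same thing, the usual inspection/repair sequence for an inconsistent PG (a sketch, not quoted from the thread) is:

    # Confirm which PG is inconsistent and why
    ceph health detail
    rados list-inconsistent-obj 1.3ff --format=json-pretty
    # Ask the primary OSD to repair it; a stat mismatch is normally repairable this way
    ceph pg repair 1.3ff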

[ceph-users] Re: Worst thing that can happen if I have size= 2

2021-02-05 Thread Simon Ironside
On 05/02/2021 20:10, Mario Giammarco wrote: It is not that one morning I wake up and put some random hardware together; I followed guidelines. The result should be: - if a disk (or more) breaks, work goes on - if a server breaks, the VMs on that server start on another server and work goes on. The

[ceph-users] Re: Worst thing that can happen if I have size= 2

2021-02-03 Thread Simon Ironside
On 03/02/2021 19:48, Mario Giammarco wrote: It is obvious and a bit paranoid, because many servers at many customers run on RAID1, and so you are saying: yeah, you have two copies of the data but you can break both. Consider that in Ceph recovery is automatic, while with RAID1 someone must manually

[ceph-users] Re: Worst thing that can happen if I have size= 2

2021-02-03 Thread Simon Ironside
On 03/02/2021 09:24, Mario Giammarco wrote: Hello, imagine this situation: - 3 servers with Ceph - a pool with size 2, min 1. I know perfectly well that size 3 and min 2 is better; I would like to know what is the worst thing that can happen: Hi Mario, This thread is worth a read; it's an oldie but
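For completeness, moving an existing replicated pool to the recommended values is one command per setting (the pool name is a placeholder):

    ceph osd pool set rbd_pool size 3
    ceph osd pool set rbd_pool min_size 2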

[ceph-users] Re: Are there 'tuned profiles' for various ceph scenarios?

2020-07-01 Thread Simon Ironside
Here's an example for SCSI disks (the main benefit vs VirtIO is discard/unmap/TRIM support): [libvirt <driver> element with discard='unmap'; the XML tags were stripped by the archive]. You also need a VirtIO-SCSI controller to use these, which will look something like: [libvirt <controller>/<address> element ending function='0x0', also stripped]. Cheers, Simon. On 01/07/2020
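A representative reconstruction of what such libvirt XML looks like; everything other than discard='unmap' and function='0x0' is an assumption, including the pool/image/monitor names, and the cephx <auth> element is omitted:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
      <source protocol='rbd' name='rbd_pool/vm_image'>
        <host name='mon1.example.com' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>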

[ceph-users] Re: Possible bug with rbd export/import?

2020-03-10 Thread Simon Ironside
On 10/03/2020 19:31, Matt Dunavant wrote: We're using rbd images for VM drives both with and without custom stripe sizes. When we try to export/import the drive to another ceph cluster, the VM always comes up in a busted state it can't recover from. Don't shoot me for asking but is the VM
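One hedged way to carry custom striping across clusters, assuming the plain export stream does not preserve it; pool, image and striping values are placeholders:

    # Note the source image's striping before exporting
    rbd info srcpool/vm-disk
    # Export to a file, then re-create the image on the destination cluster
    # with the same striping parameters specified at import time
    rbd export srcpool/vm-disk vm-disk.img
    rbd import --stripe-unit 65536 --stripe-count 16 vm-disk.img dstpool/vm-disk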

[ceph-users] Re: moving small production cluster to different datacenter

2020-01-28 Thread Simon Ironside
And us too, exactly as below. One at a time then wait for things to recover before moving the next host. We didn't have any issues with this approach either. Regards, Simon. On 28/01/2020 13:03, Tobias Urdin wrote: We did this as well, pretty much the same as Wido. We had a fiber connection
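The per-host sequence isn't spelled out in the thread, but a typical sketch is:

    # Stop the cluster marking this host's OSDs out while it is powered off in transit
    ceph osd set noout
    # ...shut down, move and power the host back on, then wait for HEALTH_OK...
    ceph -s
    ceph osd unset noout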

[ceph-users] Re: v14.2.5 Nautilus released

2019-12-10 Thread Simon Ironside
Thanks all! On 10/12/2019 09:45, Abhishek Lekshmanan wrote: This is the fifth release of the Ceph Nautilus release series. Among the many notable changes, this release fixes a critical BlueStore bug that was introduced in 14.2.3. All Nautilus users are advised to upgrade to this release. For

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-12-02 Thread Simon Ironside
Any word on 14.2.5? Nervously waiting here . . . Thanks, Simon. On 18/11/2019 11:29, Simon Ironside wrote: I will sit tight and wait for 14.2.5. Thanks again, Simon. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-11-18 Thread Simon Ironside
Hi Igor, Thanks very much for providing all this detail. On 18/11/2019 10:43, Igor Fedotov wrote: - Check how full their DB devices are? For your case it makes sense to check this. And then safely wait for 14.2.5 if it's not full. bluefs.db_used_bytes / bluefs_db_total_bytes is only around
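Those counters can be read per OSD over the admin socket; osd.0 is a placeholder and the command is run on that OSD's host:

    # Dump the BlueFS perf counters; the ratio in question is db_used_bytes / db_total_bytes
    ceph daemon osd.0 perf dump bluefs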

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-11-15 Thread Simon Ironside
Hi Igor, On 15/11/2019 14:22, Igor Fedotov wrote: Do you mean both standalone DB and(!!) standalone WAL devices/partitions by having SSD DB/WAL? No, 1x combined DB/WAL partition on an SSD and 1x data partition on an HDD per OSD. I.e. created like: ceph-deploy osd create --data /dev/sda
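For context, the fuller form of a ceph-deploy invocation with a separate DB/WAL partition would look something like the following; the devices and hostname are hypothetical, not taken from the thread:

    # One HDD for data plus one SSD partition holding the combined DB/WAL
    ceph-deploy osd create --data /dev/sda --block-db /dev/sdg1 node1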

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-11-15 Thread Simon Ironside
Hi, I have two new-ish 14.2.4 clusters that began life on 14.2.0, all with HDD OSDs with SSD DB/WALs, but neither has experienced obvious problems yet. What's the impact of this? Does possible data corruption mean possible silent data corruption? Or does the corruption cause the OSD

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread Simon Ironside
Hi, My three-node lab cluster is similar to yours but with 3x bluestore OSDs per node (4TB SATA spinning disks) and 1x shared DB/WAL (240GB SATA SSD) device per node. I'm only using gigabit networking (one interface public, one interface cluster), also Ceph 14.2.4 with 3x replicas. I would
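To compare raw numbers outside the VM/RBD layers, a quick baseline benchmark straight against RADOS (the pool name is a placeholder):

    rados bench -p testbench 30 write --no-cleanup
    rados bench -p testbench 30 seq
    rados -p testbench cleanup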