[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"
On Mon, 20 Sep 2021 at 18:02, Dave Piper wrote:
> Okay - I've finally got full debug logs from the flapping OSDs. The raw logs
> are both 100M each - I can email them directly if necessary. (Igor I've
> already sent these your way.)
> Both flapping OSDs are reporting the same "bluefs _allocate failed to
> allocate" errors as before. I've also noticed additional errors about
> corrupt blocks which I haven't noticed previously. E.g.
> 2021-09-08T10:42:13.316+ 7f705c4f2f00 3 rocksdb:
> [table/block_based_table_reader.cc:1117] Encountered error while reading data
> from compression dictionary block Corruption: block checksum mismatch:
> expected 0, got 2324967111 in db/501397.sst offset 18446744073709551615 size
> 18446744073709551615

Those 18446744073709551615 values are -1 (i.e. the largest unsigned 64-bit integer), so something is making the numbers wrap around below zero.
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
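A quick way to convince yourself of that interpretation (nothing Ceph-specific assumed, just how two's-complement -1 looks when printed as an unsigned 64-bit value, e.g. in bash):

printf '%u\n' -1
# prints: 18446744073709551615
python3 -c 'print(2**64 - 1)'
# prints: 18446744073709551615

So an offset or size of 18446744073709551615 in the log almost certainly means "-1"/"unknown" rather than a real 16 EiB value.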
[ceph-users] Re: rocksdb corruption with 16.2.6
Hi, Some further investigation on the failed OSDs: 1 out of 8 OSDs actually has hardware issue, [16841006.029332] sd 0:0:10:0: [sdj] tag#96 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=2s [16841006.037917] sd 0:0:10:0: [sdj] tag#34 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK cmd_age=2s [16841006.047558] sd 0:0:10:0: [sdj] tag#96 Sense Key : Medium Error [current] [16841006.057647] sd 0:0:10:0: [sdj] tag#34 CDB: Read(16) 88 00 00 00 00 00 00 07 e7 70 00 00 00 10 00 00 [16841006.064693] sd 0:0:10:0: [sdj] tag#96 Add. Sense: Unrecovered read error [16841006.073988] blk_update_request: I/O error, dev sdj, sector 518000 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [16841006.080949] sd 0:0:10:0: [sdj] tag#96 CDB: Read(16) 88 00 00 00 00 00 0b 95 d9 80 00 00 00 08 00 00 smartctl: Error 23 occurred at disk power-on lifetime: 6105 hours (254 days + 9 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 00 80 d9 95 0b Error: UNC at LBA = 0x0b95d980 = 194369920 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- 60 00 10 70 e7 07 40 00 14d+02:46:05.704 READ FPDMA QUEUED 60 00 08 ff ff ff 4f 00 14d+02:46:05.703 READ FPDMA QUEUED 60 00 08 ff ff ff 4f 00 14d+02:46:05.703 READ FPDMA QUEUED 60 00 08 ff ff ff 4f 00 14d+02:46:05.703 READ FPDMA QUEUED 60 00 08 ff ff ff 4f 00 14d+02:46:05.703 READ FPDMA QUEUED so, let's say, this might be hw fault, though the drive appears to be working fine. But the other 7 show no hw related issues. The HDDs are Seagate Exos X16, enterprise grade, servers are supermicro SSG-6029P-E1CR24L-AT059 with ECC. There are no cpu or memory errors logged in the past months on the servers, which have been up for ~200 days. So it's is unlikely HW fault. Is there something else that could be checked? I have left one OSD intact, so it can be checked further. Best regards, Andrej On 20/09/2021 17:09, Neha Ojha wrote: Can we please create a bluestore tracker issue for this (if one does not exist already), where we can start capturing all the relevant information needed to debug this? Given that this has been encountered in previous 16.2.* versions, it doesn't sound like a regression in 16.2.6 to me, rather an issue in pacific. In any case, we'll prioritize fixing it. Thanks, Neha On Mon, Sep 20, 2021 at 8:03 AM Andrej Filipcic wrote: On 20/09/2021 16:02, David Orman wrote: Same question here, for clarity, was this on upgrading to 16.2.6 from 16.2.5? Or upgrading from some other release? from 16.2.5. but the OSD services were never restarted after upgrade to .5, so it could be a leftover of previous issues. Cheers, Andrej On Mon, Sep 20, 2021 at 8:57 AM Sean wrote: I also ran into this with v16. In my case, trying to run a repair totally exhausted the RAM on the box, and was unable to complete. After removing/recreating the OSD, I did notice that it has a drastically smaller OMAP size than the other OSDs. I don’t know if that actually means anything, but just wanted to mention it in case it does. 
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL%USE VAR PGS STATUS TYPE NAME 14 hdd10.91409 1.0 11 TiB 3.3 TiB 3.2 TiB 4.6 MiB 5.4 GiB 7.7 TiB 29.81 1.02 34 uposd.14 16 hdd10.91409 1.0 11 TiB 3.3 TiB 3.3 TiB 20 KiB 9.4 GiB 7.6 TiB 30.03 1.03 35 uposd.16 ~ Sean On Sep 20, 2021 at 8:27:39 AM, Paul Mezzanini wrote: I got the exact same error on one of my OSDs when upgrading to 16. I used it as an exercise on trying to fix a corrupt rocksdb. A spent a few days of poking with no success. I got mostly tool crashes like you are seeing with no forward progress. I eventually just gave up, purged the OSD, did a smart long test on the drive to be sure and then threw it back in the mix. Been HEALTH OK for a week now after it finished refilling the drive. On 9/19/21 10:47 AM, Andrej Filipcic wrote: 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background compaction error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032, Accumulated background error counts: 1 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 Rocksdb transaction: ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___
[ceph-users] RocksDB options for HDD, SSD, NVME Mixed productions
Hello everyone!

I want to understand the concept and tune my RocksDB options on Nautilus 14.2.16.

osd.178 spilled over 102 GiB metadata from 'db' device (24 GiB used of 50 GiB) to slow device
osd.180 spilled over 91 GiB metadata from 'db' device (33 GiB used of 50 GiB) to slow device

The problem is, I have the spillover warnings like the rest of the community. I tuned the RocksDB options with the settings below, but the problem still exists and I wonder if I did anything wrong. I still have the spillovers, and sometimes the index SSDs also go down due to compaction problems and cannot be started again until I do an offline compaction.

Let me tell you about my hardware. Every server in my system has:
HDD - 19 x TOSHIBA MG08SCA16TEY 16.0TB for the EC pool.
SSD - 3 x SAMSUNG MZILS960HEHP/007 GXL0 960GB
NVME - 2 x PM1725b 1.6TB

I'm using RAID 1 NVMe for the BlueStore DB. I don't have a separate WAL. 19 * 50 GB = 950 GB total usage on the NVMe. (I was thinking of using the rest, but regret it now.)

So! Finally, let's check my RocksDB options:

[osd]
bluefs_buffered_io = true
bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,flusher_threads=8,compaction_readahead_size=2MB,compaction_threads=16,max_bytes_for_level_base=536870912,max_bytes_for_level_multiplier=10

"ceph osd df tree" to see SSD and HDD usage, OMAP and META:

> ID   CLASS  WEIGHT     REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
> -28         280.04810         -  280 TiB  169 TiB  166 TiB  688 GiB  2.4 TiB  111 TiB  60.40  1.00    -              host MHNX1
> 178    hdd   14.60149   1.0   15 TiB  8.6 TiB  8.5 TiB   44 KiB  126 GiB  6.0 TiB  59.21  0.98  174      up      osd.178
> 179    ssd    0.87329   1.0  894 GiB  415 GiB   89 GiB  321 GiB  5.4 GiB  479 GiB  46.46  0.77  104      up      osd.179

I know the size of the NVMe is not suitable for 16TB HDDs. I should have more, but the expense is cutting us to pieces. Because of that I think I'll see the spillovers no matter what I do. But maybe I will make it better with your help!

My questions are:
1- What is the meaning of (33 GiB used of 50 GiB)?
2- Why is it not 50 GiB / 50 GiB?
3- Do I have 17 GiB of unused area on the DB partition?
4- Is there anything wrong with my RocksDB options?
5- How can I be sure and find good RocksDB options for Ceph?
6- How can I measure the change and test it?
7- Do I need different RocksDB options for HDDs and SSDs?
8- If I stop using NVMe RAID 1 to gain twice the size and resize the DBs to 160 GiB, is it worth the risk of an NVMe failure? I would lose 10 HDDs at the same time, but I have 10 nodes and that's only 5% of the EC data. I use m=8 k=2.

P.S: There are so many people asking and searching around this. I hope it will work this time.
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
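Regarding questions 1-3, the usual explanation (a rough sketch of the arithmetic, not an exact accounting for this cluster): with level-style compaction each RocksDB level is max_bytes_for_level_multiplier times the size of the previous one, starting from max_bytes_for_level_base, and RocksDB/BlueFS will only keep a level on the fast 'db' device if the whole level fits there; whatever doesn't fit ends up on the slow device. With the options above the level targets work out to:

base_mib=512   # max_bytes_for_level_base = 536870912 bytes
for level in 1 2 3 4; do
    echo "L${level} target: $(( base_mib * 10 ** (level - 1) )) MiB"
done
# L1 target: 512 MiB
# L2 target: 5120 MiB
# L3 target: 51200 MiB
# L4 target: 512000 MiB

L1 plus L2 is only about 5.5 GiB, while L3 alone is about 50 GiB, so L3 can never sit next to the lower levels on a 50 GiB partition and spills to the HDD; what shows up as "used" is the levels that do fit plus the WAL and compaction scratch space, and the rest of the partition is headroom that never receives a whole level. With this base size the next genuinely useful DB sizes would be roughly 60+ GiB (enough for L3) or 600+ GiB (enough for L4). Actual per-OSD spillover can be read from the BlueFS counters, e.g. "ceph daemon osd.178 perf dump" and comparing db_used_bytes with slow_used_bytes in the "bluefs" section.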
[ceph-users] Re: Safe value for maximum speed backfilling
you can see an example below of changing it on the fly:

sudo ceph tell osd.\* injectargs '--osd_max_backfills 4'
sudo ceph tell osd.\* injectargs '--osd_heartbeat_interval 15'
sudo ceph tell osd.\* injectargs '--osd_recovery_max_active 4'
sudo ceph tell osd.\* injectargs '--osd_recovery_op_priority 63'
sudo ceph tell osd.\* injectargs '--osd_client_op_priority 3'

On Mon, 20 Sep 2021 at 17:29, Szabo, Istvan (Agoda) < istvan.sz...@agoda.com> wrote:
> Hi,
>
> 7 node, ec 4:2 host based crush, ssd osds with nvme wal+db, what shouldn't
> cause any issue with these values?
>
> osd_max_backfills = 1
> osd_recovery_max_active = 1
> osd_recovery_op_priority = 1
>
> I want to speed it up but haven't really found any reference.
>
> Ty
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
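One caveat worth adding (general Ceph behaviour rather than anything specific to this thread): injectargs only changes the running daemons, so the values are lost when an OSD restarts. On Nautilus and later the same settings can be made persistent in the mon config database instead, for example:

sudo ceph config set osd osd_max_backfills 4
sudo ceph config set osd osd_recovery_max_active 4
# and to drop back to the defaults once backfilling is done:
sudo ceph config rm osd osd_max_backfills
sudo ceph config rm osd osd_recovery_max_active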
[ceph-users] Re: etcd support
Hi,

In general I see nothing you can do except using SSDs (like having a pool of SSDs in your Ceph cluster, or using other shared storage with SSDs).

One option we use (not production-wise, just for lab testing when suffering from a lack of hardware) is backing etcd with memory (of the VM). This way etcd performance gets a boost, but again, it is not good for production :)

Regards
kobi ginon

On Mon, 20 Sep 2021 at 20:59, Tony Liu < tonyliu0...@hotmail.com> wrote:
> Hi,
>
> I wonder if anyone could share some experiences in etcd support by Ceph.
> My users build Kubernetes cluster in VMs on OpenStack with Ceph.
> With HDD (DB/WAL on SSD) volume, etcd performance test fails sometimes
> because of latency. With SSD (all SSD) volume, it works fine.
> I wonder if there is anything I can improve with HDD volume, or it has to
> be SSD volume to support etcd?
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"
Okay - I've finally got full debug logs from the flapping OSDs. The raw logs are both 100M each - I can email them directly if necessary. (Igor I've already sent these your way.)

Both flapping OSDs are reporting the same "bluefs _allocate failed to allocate" errors as before. I've also noticed additional errors about corrupt blocks which I haven't noticed previously. E.g.

2021-09-08T10:42:13.316+ 7f705c4f2f00 3 rocksdb: [table/block_based_table_reader.cc:1117] Encountered error while reading data from compression dictionary block Corruption: block checksum mismatch: expected 0, got 2324967111 in db/501397.sst offset 18446744073709551615 size 18446744073709551615

FTR (I realised I never posted this before) our osd tree is:

[qs-admin@condor_sc0 ~]$ sudo docker exec fe4eb75fc98b ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
-1         1.02539  root default
-7         0.34180      host condor_sc0
 1    ssd  0.34180          osd.1          down         0      1.0
-5         0.34180      host condor_sc1
 0    ssd  0.34180          osd.0            up       1.0      1.0
-3         0.34180      host condor_sc2
 2    ssd  0.34180          osd.2          down       1.0      1.0

I've still not managed to get the ceph-bluestore-tool output - will get back to you on that.
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
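In case it helps anyone gathering the same data: the kind of ceph-bluestore-tool output being asked for is usually collected with the OSD stopped, along these lines (the default data path is an assumption here; in a containerised deployment like the above the commands have to run inside the OSD's container):

# sizes and free space of the devices BlueFS knows about
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-2 bluefs-bdev-sizes
# labels of the OSD's block device(s)
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-2 show-label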
[ceph-users] Safe value for maximum speed backfilling
Hi, 7 node, ec 4:2 host based crush, ssd osds with nvme wal+db, what shouldn't cause any issue with these values? osd_max_backfills = 1 osd_recovery_max_active = 1 osd_recovery_op_priority = 1 I want to speed it up but haven't really found any reference. Ty ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Getting cephadm "stderr:Inferring config" every minute in log - for a monitor that doesn't exist and shouldn't exist
That may be pointing in the right direction - I see { "style": "legacy", "name": "mon.rhel1.robeckert.us", "fsid": "fe3a7cb0-69ca-11eb-8d45-c86000d08867", "systemd_unit": "ceph-...@rhel1.robeckert.us", "enabled": false, "state": "stopped", "host_version": "16.2.5" }, And { "style": "cephadm:v1", "name": "mon.rhel1", "fsid": "fe3a7cb0-69ca-11eb-8d45-c86000d08867", "systemd_unit": "ceph-fe3a7cb0-69ca-11eb-8d45-c86000d08867@mon.rhel1", "enabled": true, "state": "running", "service_name": "mon", "ports": [], "ip": null, "deployed_by": [ "quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac", "quay.io/ceph/ceph@sha256:8a0f6f285edcd6488e2c91d3f9fa43534d37d7a9b37db1e0ff6691aae6466530" ], "rank": null, "rank_generation": null, "memory_request": null, "memory_limit": null, "container_id": null, "container_image_name": "quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac", "container_image_id": null, "container_image_digests": null, "version": null, "started": null, "created": "2021-09-20T15:46:42.166486Z", "deployed": "2021-09-20T15:46:41.136498Z", "configured": "2021-09-20T15:47:23.002007Z" } As the output. In /var/lib/ceph/mon (not /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon), there is a link: ceph-rhel1.robeckert.us -> /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/ I removed the link and the error did clear up. (hopefully it will stay gone :-)) Thanks, Rob -Original Message- From: Fyodor Ustinov Sent: Monday, September 20, 2021 2:01 PM To: Robert W. Eckert Cc: ceph-users Subject: Re: [ceph-users] Getting cephadm "stderr:Inferring config" every minute in log - for a monitor that doesn't exist and shouldn't exist Hi! It looks exactly the same as the problem I had. Try the `cephadm ls` command on the `rhel1.robeckert.us` node. - Original Message - > From: "Robert W. Eckert" > To: "ceph-users" > Sent: Monday, 20 September, 2021 18:28:08 > Subject: [ceph-users] Getting cephadm "stderr:Inferring config" every > minute in log - for a monitor that doesn't exist and shouldn't exist > Hi- after the upgrade to 16.2.6, I am now seeing this error: > > 9/20/21 10:45:00 AM[ERR]cephadm exited with an error code: 1, > stderr:Inferring config > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert > .us/config > ERROR: [Errno 2] No such file or directory: > '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' > Traceback (most recent call last): File > "/usr/share/ceph/mgr/cephadm/serve.py", > line 1366, in _remote_connection yield (conn, connr) File > "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm > code, > '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm > exited with an error code: 1, stderr:Inferring config > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert > .us/config > ERROR: [Errno 2] No such file or directory: > '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' > > The rhel1 server has a monitor under > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1 , and it > is up and active. If I copy the > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1 to > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert > .us the error clears, then cephadm removes the folder with the domain > name, and the error starts showing up in the log again. 
> > After a few minutes, I get the all clear: > > 9/20/21 11:00:00 AM[INF]overall HEALTH_OK > > 9/20/21 10:58:38 AM[INF]Removing key for mon. > > 9/20/21 10:58:37 AM[INF]Removing daemon mon.rhel1.robeckert.us from > rhel1.robeckert.us > > 9/20/21 10:58:37 AM[INF]Removing monitor rhel1.robeckert.us from monmap... > > 9/20/21 10:58:37 AM[INF]Safe to remove mon.rhel1.robeckert.us: not in > monmap (['rhel1', 'story', 'cube']) > > 9/20/21 10:52:21 AM[INF]Cluster is now healthy > > 9/20/21 10:52:21 AM[INF]Health check cleared: CEPHADM_REFRESH_FAILED (was: > failed to probe daemons or devices) > > 9/20/21 10:51:15 AM > > > I checked all of the configurations and can't find any reason it wants > the monitor with the domain. > > But then the errors start up again - I haven't found any messages > before they start up, I am going to monitor more closely. > This doesn't seem to affect any functionality, just lots of messages in the > log. > > Thanks, > Rob > > ___ > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an > email to ceph-users-le...@ceph.io ___ cep
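For anyone else chasing the same symptom: the combination that gave it away here - a leftover "legacy" mon entry in cephadm ls plus a stale link under /var/lib/ceph/mon - can be checked quickly on the affected host (paths as in this thread, adjust to your fsid and hostnames):

sudo cephadm ls | grep -B1 -A3 '"style": "legacy"'
ls -l /var/lib/ceph/mon/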
[ceph-users] Re: Getting cephadm "stderr:Inferring config" every minute in log - for a monitor that doesn't exist and shouldn't exist
Hi! It looks exactly the same as the problem I had. Try the `cephadm ls` command on the `rhel1.robeckert.us` node. - Original Message - > From: "Robert W. Eckert" > To: "ceph-users" > Sent: Monday, 20 September, 2021 18:28:08 > Subject: [ceph-users] Getting cephadm "stderr:Inferring config" every minute > in log - for a monitor that doesn't exist > and shouldn't exist > Hi- after the upgrade to 16.2.6, I am now seeing this error: > > 9/20/21 10:45:00 AM[ERR]cephadm exited with an error code: 1, stderr:Inferring > config > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config > ERROR: [Errno 2] No such file or directory: > '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' > Traceback (most recent call last): File > "/usr/share/ceph/mgr/cephadm/serve.py", > line 1366, in _remote_connection yield (conn, connr) File > "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm code, > '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited > with > an error code: 1, stderr:Inferring config > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config > ERROR: [Errno 2] No such file or directory: > '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' > > The rhel1 server has a monitor under > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1 , and it is up > and > active. If I copy the > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1 to > /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us the > error clears, then cephadm removes the folder with the domain name, and the > error starts showing up in the log again. > > After a few minutes, I get the all clear: > > 9/20/21 11:00:00 AM[INF]overall HEALTH_OK > > 9/20/21 10:58:38 AM[INF]Removing key for mon. > > 9/20/21 10:58:37 AM[INF]Removing daemon mon.rhel1.robeckert.us from > rhel1.robeckert.us > > 9/20/21 10:58:37 AM[INF]Removing monitor rhel1.robeckert.us from monmap... > > 9/20/21 10:58:37 AM[INF]Safe to remove mon.rhel1.robeckert.us: not in monmap > (['rhel1', 'story', 'cube']) > > 9/20/21 10:52:21 AM[INF]Cluster is now healthy > > 9/20/21 10:52:21 AM[INF]Health check cleared: CEPHADM_REFRESH_FAILED (was: > failed to probe daemons or devices) > > 9/20/21 10:51:15 AM > > > I checked all of the configurations and can't find any reason it wants the > monitor with the domain. > > But then the errors start up again - I haven't found any messages before they > start up, I am going to monitor more closely. > This doesn't seem to affect any functionality, just lots of messages in the > log. > > Thanks, > Rob > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] etcd support
Hi, I wonder if anyone could share some experiences in etcd support by Ceph. My users build Kubernetes cluster in VMs on OpenStack with Ceph. With HDD (DB/WAL on SSD) volume, etcd performance test fails sometimes because of latency. With SSD (all SSD) volume, it works fine. I wonder if there is anything I can improve with HDD volume, or it has to be SSD volume to support etcd? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
I was doing a rolling upgrade from 14.2.x -> 15.2.x (wait a week) -> 16.2.5. It was the last jump that had the hiccup. I'm doing the 16.2.5 -> .6 upgrade as I type this. So far, so good. -paul On 9/20/21 10:02 AM, David Orman wrote: For clarity, was this on upgrading to 16.2.6 from 16.2.5? Or upgrading from some other release? On Mon, Sep 20, 2021 at 8:33 AM Paul Mezzanini wrote: I got the exact same error on one of my OSDs when upgrading to 16. I used it as an exercise on trying to fix a corrupt rocksdb. A spent a few days of poking with no success. I got mostly tool crashes like you are seeing with no forward progress. I eventually just gave up, purged the OSD, did a smart long test on the drive to be sure and then threw it back in the mix. Been HEALTH OK for a week now after it finished refilling the drive. On 9/19/21 10:47 AM, Andrej Filipcic wrote: 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background compaction error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032, Accumulated background error counts: 1 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 Rocksdb transaction: ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Getting cephadm "stderr:Inferring config" every minute in log - for a monitor that doesn't exist and shouldn't exist
Just after I sent, the error message started again: 9/20/21 11:30:00 AM [WRN] ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' 9/20/21 11:30:00 AM [WRN] host rhel1.robeckert.us `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config 9/20/21 11:30:00 AM [WRN] [WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices 9/20/21 11:30:00 AM [WRN] Health detail: HEALTH_WARN failed to probe daemons or devices 9/20/21 11:29:45 AM [ERR] cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' Traceback (most recent call last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1366, in _remote_connection yield (conn, connr) File "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm code, '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' 9/20/21 11:28:39 AM [ERR] cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' Traceback (most recent call last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1366, in _remote_connection yield (conn, connr) File "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm code, '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' 9/20/21 11:27:37 AM [ERR] cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' Traceback (most recent call last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1366, in _remote_connection yield (conn, connr) File "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm code, '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' 9/20/21 11:26:31 AM [ERR] cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' Traceback (most recent call last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1366, in _remote_connection yield (conn, connr) File 
"/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm code, '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' 9/20/21 11:25:29 AM [ERR] cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' Traceback (most recent call last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1366, in _remote_connection yield (conn, connr) File "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm code, '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' 9/20/21 11:24:28 AM [ERR] cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/
[ceph-users] Re: rocksdb corruption with 16.2.6
FWIW, we've had similar reports in the past: https://tracker.ceph.com/issues/37282 https://tracker.ceph.com/issues/48002 https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/2GBK5NJFOSQGMN25GQ3CZNX4W2ZGQV5U/?sort=date https://www.spinics.net/lists/ceph-users/msg59466.html https://www.bountysource.com/issues/49313514-block-checksum-mismatch ...but we aren't the only ones: https://github.com/facebook/rocksdb/issues/5251 https://github.com/facebook/rocksdb/issues/7033 https://jira.mariadb.org/browse/MDEV-20456 https://lists.launchpad.net/maria-discuss/msg05614.html https://githubmemory.com/repo/openethereum/openethereum/issues/416 https://githubmemory.com/repo/FISCO-BCOS/FISCO-BCOS/issues/1895 https://groups.google.com/g/rocksdb/c/gUD4kCGTw-0/m/uLpFwkO5AgAJ At least in one case for us, the user was using consumer grade SSDs without power loss protection. I don't think we ever fully diagnosed if that was the cause though. Another case potentially was related to high memory usage on the node. Hardware errors are a legitimate concern here so probably checking dmesg/smartctl/etc is warranted. ECC memory obviously helps too (or rather the lack of which makes it more difficult to diagnose). For folks that have experienced this, any info you can give related to the HW involved would be helpful. We (and other projects) have seen similar things over the years but this is a notoriously difficult issue to track down given that it could be any one of many different things and it may or may not be our code. Mark On 9/20/21 10:09 AM, Neha Ojha wrote: Can we please create a bluestore tracker issue for this (if one does not exist already), where we can start capturing all the relevant information needed to debug this? Given that this has been encountered in previous 16.2.* versions, it doesn't sound like a regression in 16.2.6 to me, rather an issue in pacific. In any case, we'll prioritize fixing it. Thanks, Neha On Mon, Sep 20, 2021 at 8:03 AM Andrej Filipcic wrote: On 20/09/2021 16:02, David Orman wrote: Same question here, for clarity, was this on upgrading to 16.2.6 from 16.2.5? Or upgrading from some other release? from 16.2.5. but the OSD services were never restarted after upgrade to .5, so it could be a leftover of previous issues. Cheers, Andrej On Mon, Sep 20, 2021 at 8:57 AM Sean wrote: I also ran into this with v16. In my case, trying to run a repair totally exhausted the RAM on the box, and was unable to complete. After removing/recreating the OSD, I did notice that it has a drastically smaller OMAP size than the other OSDs. I don’t know if that actually means anything, but just wanted to mention it in case it does. ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL%USE VAR PGS STATUS TYPE NAME 14 hdd10.91409 1.0 11 TiB 3.3 TiB 3.2 TiB 4.6 MiB 5.4 GiB 7.7 TiB 29.81 1.02 34 uposd.14 16 hdd10.91409 1.0 11 TiB 3.3 TiB 3.3 TiB 20 KiB 9.4 GiB 7.6 TiB 30.03 1.03 35 uposd.16 ~ Sean On Sep 20, 2021 at 8:27:39 AM, Paul Mezzanini wrote: I got the exact same error on one of my OSDs when upgrading to 16. I used it as an exercise on trying to fix a corrupt rocksdb. A spent a few days of poking with no success. I got mostly tool crashes like you are seeing with no forward progress. I eventually just gave up, purged the OSD, did a smart long test on the drive to be sure and then threw it back in the mix. Been HEALTH OK for a week now after it finished refilling the drive. 
On 9/19/21 10:47 AM, Andrej Filipcic wrote: 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background compaction error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032, Accumulated background error counts: 1 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 Rocksdb transaction: ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- _ prof. dr. Andrej Filipcic, E-mail: andrej.filip...@ijs.si Department of Experimental High Energy Physics - F9 Jozef Stefan Institute, Jamova 39, P.o.Box 3000 SI-1001 Ljubljana, Slovenia Tel.: +386-1-477-3674Fax: +386-1-425-7074 --
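For the hardware checks mentioned above, the usual quick passes look something like this (device name and tooling availability are assumptions; rasdaemon is only present if installed):

# kernel-level I/O errors since boot
dmesg -T | egrep -i 'i/o error|medium error|uncorrect' | tail -n 20
# the drive's own error log and counters
sudo smartctl -a /dev/sdX | egrep -i 'reallocated|pending|uncorrect|error'
# ECC / machine-check history, if rasdaemon is installed
sudo ras-mc-ctl --summary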
[ceph-users] Getting cephadm "stderr:Inferring config" every minute in log - for a monitor that doesn't exist and shouldn't exist
Hi- after the upgrade to 16.2.6, I am now seeing this error: 9/20/21 10:45:00 AM[ERR]cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' Traceback (most recent call last): File "/usr/share/ceph/mgr/cephadm/serve.py", line 1366, in _remote_connection yield (conn, connr) File "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm code, '\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config' The rhel1 server has a monitor under /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1 , and it is up and active. If I copy the /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1 to /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us the error clears, then cephadm removes the folder with the domain name, and the error starts showing up in the log again. After a few minutes, I get the all clear: 9/20/21 11:00:00 AM[INF]overall HEALTH_OK 9/20/21 10:58:38 AM[INF]Removing key for mon. 9/20/21 10:58:37 AM[INF]Removing daemon mon.rhel1.robeckert.us from rhel1.robeckert.us 9/20/21 10:58:37 AM[INF]Removing monitor rhel1.robeckert.us from monmap... 9/20/21 10:58:37 AM[INF]Safe to remove mon.rhel1.robeckert.us: not in monmap (['rhel1', 'story', 'cube']) 9/20/21 10:52:21 AM[INF]Cluster is now healthy 9/20/21 10:52:21 AM[INF]Health check cleared: CEPHADM_REFRESH_FAILED (was: failed to probe daemons or devices) 9/20/21 10:51:15 AM I checked all of the configurations and can't find any reason it wants the monitor with the domain. But then the errors start up again - I haven't found any messages before they start up, I am going to monitor more closely. This doesn't seem to affect any functionality, just lots of messages in the log. Thanks, Rob ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
Can we please create a bluestore tracker issue for this (if one does not exist already), where we can start capturing all the relevant information needed to debug this? Given that this has been encountered in previous 16.2.* versions, it doesn't sound like a regression in 16.2.6 to me, rather an issue in pacific. In any case, we'll prioritize fixing it. Thanks, Neha On Mon, Sep 20, 2021 at 8:03 AM Andrej Filipcic wrote: > > On 20/09/2021 16:02, David Orman wrote: > > Same question here, for clarity, was this on upgrading to 16.2.6 from > > 16.2.5? Or upgrading > > from some other release? > > from 16.2.5. but the OSD services were never restarted after upgrade to > .5, so it could be a leftover of previous issues. > > Cheers, > Andrej > > > > On Mon, Sep 20, 2021 at 8:57 AM Sean wrote: > >> I also ran into this with v16. In my case, trying to run a repair totally > >> exhausted the RAM on the box, and was unable to complete. > >> > >> After removing/recreating the OSD, I did notice that it has a drastically > >> smaller OMAP size than the other OSDs. I don’t know if that actually > >> means > >> anything, but just wanted to mention it in case it does. > >> > >> ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META > >>AVAIL%USE VAR PGS STATUS TYPE NAME > >> 14 hdd10.91409 1.0 11 TiB 3.3 TiB 3.2 TiB 4.6 MiB 5.4 GiB > >> 7.7 TiB 29.81 1.02 34 uposd.14 > >> 16 hdd10.91409 1.0 11 TiB 3.3 TiB 3.3 TiB 20 KiB 9.4 GiB > >> 7.6 TiB 30.03 1.03 35 uposd.16 > >> > >> ~ Sean > >> > >> > >> On Sep 20, 2021 at 8:27:39 AM, Paul Mezzanini wrote: > >> > >>> I got the exact same error on one of my OSDs when upgrading to 16. I > >>> used it as an exercise on trying to fix a corrupt rocksdb. A spent a few > >>> days of poking with no success. I got mostly tool crashes like you are > >>> seeing with no forward progress. > >>> > >>> I eventually just gave up, purged the OSD, did a smart long test on the > >>> drive to be sure and then threw it back in the mix. Been HEALTH OK for > >>> a week now after it finished refilling the drive. > >>> > >>> > >>> On 9/19/21 10:47 AM, Andrej Filipcic wrote: > >>> > >>> 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: > >>> > >>> [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background > >>> > >>> compaction error: Corruption: block checksum mismatch: expected > >>> > >>> 2427092066, got 4051549320 in db/251935.sst offset 18414386 size > >>> > >>> 4032, Accumulated background error counts: 1 > >>> > >>> 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common > >>> > >>> error: Corruption: block checksum mismatch: expected 2427092066, got > >>> > >>> 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 > >>> > >>> Rocksdb transaction: > >>> > >>> ___ > >>> ceph-users mailing list -- ceph-users@ceph.io > >>> To unsubscribe send an email to ceph-users-le...@ceph.io > >>> > >> ___ > >> ceph-users mailing list -- ceph-users@ceph.io > >> To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > -- > _ > prof. dr. 
Andrej Filipcic, E-mail: andrej.filip...@ijs.si > Department of Experimental High Energy Physics - F9 > Jozef Stefan Institute, Jamova 39, P.o.Box 3000 > SI-1001 Ljubljana, Slovenia > Tel.: +386-1-477-3674Fax: +386-1-425-7074 > - > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
On 20/09/2021 16:02, David Orman wrote: Same question here, for clarity, was this on upgrading to 16.2.6 from 16.2.5? Or upgrading from some other release? from 16.2.5. but the OSD services were never restarted after upgrade to .5, so it could be a leftover of previous issues. Cheers, Andrej On Mon, Sep 20, 2021 at 8:57 AM Sean wrote: I also ran into this with v16. In my case, trying to run a repair totally exhausted the RAM on the box, and was unable to complete. After removing/recreating the OSD, I did notice that it has a drastically smaller OMAP size than the other OSDs. I don’t know if that actually means anything, but just wanted to mention it in case it does. ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL%USE VAR PGS STATUS TYPE NAME 14 hdd10.91409 1.0 11 TiB 3.3 TiB 3.2 TiB 4.6 MiB 5.4 GiB 7.7 TiB 29.81 1.02 34 uposd.14 16 hdd10.91409 1.0 11 TiB 3.3 TiB 3.3 TiB 20 KiB 9.4 GiB 7.6 TiB 30.03 1.03 35 uposd.16 ~ Sean On Sep 20, 2021 at 8:27:39 AM, Paul Mezzanini wrote: I got the exact same error on one of my OSDs when upgrading to 16. I used it as an exercise on trying to fix a corrupt rocksdb. A spent a few days of poking with no success. I got mostly tool crashes like you are seeing with no forward progress. I eventually just gave up, purged the OSD, did a smart long test on the drive to be sure and then threw it back in the mix. Been HEALTH OK for a week now after it finished refilling the drive. On 9/19/21 10:47 AM, Andrej Filipcic wrote: 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background compaction error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032, Accumulated background error counts: 1 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 Rocksdb transaction: ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- _ prof. dr. Andrej Filipcic, E-mail: andrej.filip...@ijs.si Department of Experimental High Energy Physics - F9 Jozef Stefan Institute, Jamova 39, P.o.Box 3000 SI-1001 Ljubljana, Slovenia Tel.: +386-1-477-3674Fax: +386-1-425-7074 - ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
In my case it happened after upgrading from v16.2.4 to v16.2.5 a couple months ago. ~ Sean On Sep 20, 2021 at 9:02:45 AM, David Orman wrote: > Same question here, for clarity, was this on upgrading to 16.2.6 from > 16.2.5? Or upgrading > from some other release? > > On Mon, Sep 20, 2021 at 8:57 AM Sean wrote: > > > I also ran into this with v16. In my case, trying to run a repair totally > > exhausted the RAM on the box, and was unable to complete. > > > After removing/recreating the OSD, I did notice that it has a drastically > > smaller OMAP size than the other OSDs. I don’t know if that actually means > > anything, but just wanted to mention it in case it does. > > > ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META > > AVAIL%USE VAR PGS STATUS TYPE NAME > > 14 hdd10.91409 1.0 11 TiB 3.3 TiB 3.2 TiB 4.6 MiB 5.4 GiB > > 7.7 TiB 29.81 1.02 34 uposd.14 > > 16 hdd10.91409 1.0 11 TiB 3.3 TiB 3.3 TiB 20 KiB 9.4 GiB > > 7.6 TiB 30.03 1.03 35 uposd.16 > > > ~ Sean > > > > On Sep 20, 2021 at 8:27:39 AM, Paul Mezzanini wrote: > > > > I got the exact same error on one of my OSDs when upgrading to 16. I > > > used it as an exercise on trying to fix a corrupt rocksdb. A spent a few > > > days of poking with no success. I got mostly tool crashes like you are > > > seeing with no forward progress. > > > > > > I eventually just gave up, purged the OSD, did a smart long test on the > > > drive to be sure and then threw it back in the mix. Been HEALTH OK for > > > a week now after it finished refilling the drive. > > > > > > > > > On 9/19/21 10:47 AM, Andrej Filipcic wrote: > > > > > > 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: > > > > > > [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background > > > > > > compaction error: Corruption: block checksum mismatch: expected > > > > > > 2427092066, got 4051549320 in db/251935.sst offset 18414386 size > > > > > > 4032, Accumulated background error counts: 1 > > > > > > 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common > > > > > > error: Corruption: block checksum mismatch: expected 2427092066, got > > > > > > 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 > > > > > > Rocksdb transaction: > > > > > > ___ > > > ceph-users mailing list -- ceph-users@ceph.io > > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: OSDs unable to mount BlueFS after reboot
On 9/20/21 12:00, Davíð Steinn Geirsson wrote:
>> Does the SAS controller run the latest firmware?
>
> As far as I can tell yes. Avago's website does not seem to list these
> anymore, but they are running firmware version 20 which is the latest I can
> find references to in a web search.
>
> This machine has been chugging along like this for years (it was a single-
> node ZFS NFS server before) and I've never had any such issues before.
>
>> I'm not sure what your failure domain is, but I would certainly want to
>> try to reproduce this issue.
>
> I'd be interested to hear any ideas you have about that. The failure domain
> is host[1], but this is a 3-node cluster so there isn't much room for taking
> a machine down for longer periods. Taking OSDs down is no problem.

Reboot for starters. And a "yank the power cord" next.

> The two other machines in the cluster have very similar hardware and
> software so I am concerned about seeing the same there on reboot.
> Backfilling these 16TB spinners takes a long time and is still running, I'm
> not going to reboot either of the other nodes until that is finished.

Yeah, definitely don't reboot any other node until the cluster is HEALTH_OK. But that's also the point: if those 3 hosts are all in the same rack and connected to the same power bar, sooner or later this might happen involuntarily. And if there is important data on there, you want to mitigate the risks now, not when it's too late.

Gr. Stefan
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
Same question here, for clarity, was this on upgrading to 16.2.6 from 16.2.5? Or upgrading from some other release? On Mon, Sep 20, 2021 at 8:57 AM Sean wrote: > > I also ran into this with v16. In my case, trying to run a repair totally > exhausted the RAM on the box, and was unable to complete. > > After removing/recreating the OSD, I did notice that it has a drastically > smaller OMAP size than the other OSDs. I don’t know if that actually means > anything, but just wanted to mention it in case it does. > > ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META > AVAIL%USE VAR PGS STATUS TYPE NAME > 14 hdd10.91409 1.0 11 TiB 3.3 TiB 3.2 TiB 4.6 MiB 5.4 GiB > 7.7 TiB 29.81 1.02 34 uposd.14 > 16 hdd10.91409 1.0 11 TiB 3.3 TiB 3.3 TiB 20 KiB 9.4 GiB > 7.6 TiB 30.03 1.03 35 uposd.16 > > ~ Sean > > > On Sep 20, 2021 at 8:27:39 AM, Paul Mezzanini wrote: > > > I got the exact same error on one of my OSDs when upgrading to 16. I > > used it as an exercise on trying to fix a corrupt rocksdb. A spent a few > > days of poking with no success. I got mostly tool crashes like you are > > seeing with no forward progress. > > > > I eventually just gave up, purged the OSD, did a smart long test on the > > drive to be sure and then threw it back in the mix. Been HEALTH OK for > > a week now after it finished refilling the drive. > > > > > > On 9/19/21 10:47 AM, Andrej Filipcic wrote: > > > > 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: > > > > [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background > > > > compaction error: Corruption: block checksum mismatch: expected > > > > 2427092066, got 4051549320 in db/251935.sst offset 18414386 size > > > > 4032, Accumulated background error counts: 1 > > > > 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common > > > > error: Corruption: block checksum mismatch: expected 2427092066, got > > > > 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 > > > > Rocksdb transaction: > > > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
For clarity, was this on upgrading to 16.2.6 from 16.2.5? Or upgrading from some other release? On Mon, Sep 20, 2021 at 8:33 AM Paul Mezzanini wrote: > > I got the exact same error on one of my OSDs when upgrading to 16. I > used it as an exercise on trying to fix a corrupt rocksdb. A spent a few > days of poking with no success. I got mostly tool crashes like you are > seeing with no forward progress. > > I eventually just gave up, purged the OSD, did a smart long test on the > drive to be sure and then threw it back in the mix. Been HEALTH OK for > a week now after it finished refilling the drive. > > > On 9/19/21 10:47 AM, Andrej Filipcic wrote: > > 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: > > [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background > > compaction error: Corruption: block checksum mismatch: expected > > 2427092066, got 4051549320 in db/251935.sst offset 18414386 size > > 4032, Accumulated background error counts: 1 > > 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common > > error: Corruption: block checksum mismatch: expected 2427092066, got > > 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2 > > Rocksdb transaction: > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: OSDs unable to mount BlueFS after reboot
On Mon, Sep 20, 2021 at 10:38:37AM +0200, Stefan Kooman wrote: > On 9/16/21 13:42, Davíð Steinn Geirsson wrote: > > > > > The 4 affected drives are of 3 different types from 2 different vendors: > > ST16000NM001G-2KK103 > > ST12000VN0007-2GS116 > > WD60EFRX-68MYMN1 > > > > They are all connected through an LSI2308 SAS controller in IT mode. Other > > drives that did not fail are also connected to the same controller. > > > > There are no expanders in this particular machine, only a direct-attach > > SAS backplane. > > Does the SAS controller run the latest firmware? As far as I can tell yes. Avago's website does not seem to list these anymore, but they are running firmware version 20 which is the latest I can find references to in a web search. This machine has been chugging along like this for years (it was a single- node ZFS NFS server before) and I've never had any such issues before. > > I'm not sure what your failure domain is, but I would certainly want to try > to reproduce this issue. I'd be interested to hear any ideas you have about that. The failure domain is host[1], but this is a 3-node cluster so there isn't much room for taking a machine down for longer periods. Taking OSDs down is no problem. The two other machines in the cluster have very similar hardware and software so I am concerned about seeing the same there on reboot. Backfilling these 16TB spinners takes a long time and is still running, I'm not going to reboot either of the other nodes until that is finished. > > Gr. Stefan Regards, Davíð [1] Mostly. Failure domain is host for every pool using the default CRUSH rules. There is also an EC pool with m=5 k=7, with a custom CRUSH rule to pick 3 hosts and 4 OSDs from each of the hosts. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
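For reference, a rule of the shape described in [1] (take 3 hosts, then 4 OSDs from each, for a 12-chunk EC profile) typically looks roughly like this in a decompiled crushmap - a sketch only, not the poster's actual rule:

rule ec_3hosts_4osds {
        id 2
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 3 type host
        step chooseleaf indep 4 type osd
        step emit
}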
[ceph-users] Re: rocksdb corruption with 16.2.6
I also ran into this with v16. In my case, trying to run a repair totally exhausted the RAM on the box, and was unable to complete.

After removing/recreating the OSD, I did notice that it has a drastically smaller OMAP size than the other OSDs. I don’t know if that actually means anything, but just wanted to mention it in case it does.

ID  CLASS  WEIGHT    REWEIGHT  SIZE    RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
14    hdd  10.91409   1.0    11 TiB  3.3 TiB  3.2 TiB  4.6 MiB  5.4 GiB  7.7 TiB  29.81  1.02   34      up  osd.14
16    hdd  10.91409   1.0    11 TiB  3.3 TiB  3.3 TiB   20 KiB  9.4 GiB  7.6 TiB  30.03  1.03   35      up  osd.16

~ Sean

On Sep 20, 2021 at 8:27:39 AM, Paul Mezzanini wrote:
> I got the exact same error on one of my OSDs when upgrading to 16. I
> used it as an exercise on trying to fix a corrupt rocksdb. A spent a few
> days of poking with no success. I got mostly tool crashes like you are
> seeing with no forward progress.
>
> I eventually just gave up, purged the OSD, did a smart long test on the
> drive to be sure and then threw it back in the mix. Been HEALTH OK for
> a week now after it finished refilling the drive.
>
> On 9/19/21 10:47 AM, Andrej Filipcic wrote:
> > 2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb:
> > [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background
> > compaction error: Corruption: block checksum mismatch: expected
> > 2427092066, got 4051549320 in db/251935.sst offset 18414386 size
> > 4032, Accumulated background error counts: 1
> > 2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common
> > error: Corruption: block checksum mismatch: expected 2427092066, got
> > 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2
> > Rocksdb transaction:
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
I got the exact same error on one of my OSDs when upgrading to 16. I used it as an exercise on trying to fix a corrupt rocksdb. I spent a few days of poking with no success. I got mostly tool crashes like you are seeing with no forward progress.

I eventually just gave up, purged the OSD, did a SMART long test on the drive to be sure and then threw it back in the mix. Been HEALTH_OK for a week now after it finished refilling the drive.

On 9/19/21 10:47 AM, Andrej Filipcic wrote:
2021-09-19T15:47:13.610+0200 7f8bc1f0e700 2 rocksdb: [db_impl/db_impl_compaction_flush.cc:2344] Waiting after background compaction error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032, Accumulated background error counts: 1
2021-09-19T15:47:13.636+0200 7f8bbacf1700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 2427092066, got 4051549320 in db/251935.sst offset 18414386 size 4032 code = 2
Rocksdb transaction:
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: OSDs unable to mount BlueFS after reboot
On 9/16/21 13:42, Davíð Steinn Geirsson wrote: The 4 affected drives are of 3 different types from 2 different vendors: ST16000NM001G-2KK103 ST12000VN0007-2GS116 WD60EFRX-68MYMN1 They are all connected through an LSI2308 SAS controller in IT mode. Other drives that did not fail are also connected to the same controller. There are no expanders in this particular machine, only a direct-attach SAS backplane. Does the SAS controller run the latest firmware? I'm not sure what your failure domain is, but I would certainly want to try to reproduce this issue. Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Drop of performance after Nautilus to Pacific upgrade
We tested Ceph 16.2.6, and indeed, the performances came back to what we expect for this cluster. Luis Domingues ‐‐‐ Original Message ‐‐‐ On Saturday, September 11th, 2021 at 9:55 AM, Luis Domingues wrote: > Hi Igor, > > I have a SSD for the physical DB volume. And indeed it has very high > utilisation during the benchmark. I will test 16.2.6. > > Thanks, > > Luis Domingues > > ‐‐‐ Original Message ‐‐‐ > > On Friday, September 10th, 2021 at 5:57 PM, Igor Fedotov ifedo...@suse.de > wrote: > > > Hi Luis, > > > > some chances that you're hit by https://tracker.ceph.com/issues/52089. > > > > What is your physical DB volume configuration - are there fast > > > > standalone disks for that? If so are they showing high utilization > > > > during the benchmark? > > > > It makes sense to try 16.2.6 once available - would the problem go away? > > > > Thanks, > > > > Igor > > > > On 9/5/2021 8:45 PM, Luis Domingues wrote: > > > > > Hello, > > > > > > I run a test cluster of 3 machines with 24 HDDs each, running bare-metal > > > on CentOS 8. Long story short, I can have a bandwidth of ~ 1'200 MB/s > > > when I do a rados bench, writing objects of 128k, when the cluster is > > > installed with Nautilus. > > > > > > When I upgrade the cluster to Pacific, (using ceph-ansible to deploy > > > and/or upgrade), my performances drop to ~400 MB/s of bandwidth doing the > > > same rados bench. > > > > > > I am kind of clueless on what makes the performance drop so much. Does > > > someone have some ideas where I can dig to find the root of this > > > difference? > > > > > > Thanks, > > > > > > Luis Domingues > > > > > > ceph-users mailing list -- ceph-users@ceph.io > > > > > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > > ceph-users mailing list -- ceph-users@ceph.io > > > > To unsubscribe send an email to ceph-users-le...@ceph.io > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
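For anyone repeating this comparison, the benchmark in question and the DB-device utilisation can be watched side by side with something along these lines (pool name and device name are placeholders):

# 128 KiB object writes for 60 seconds with 16 concurrent writers
rados bench -p testpool 60 write -b 131072 -t 16 --no-cleanup
# in a second terminal, watch utilisation of the DB/WAL device during the run
iostat -x 1 /dev/nvme0n1
# remove the benchmark objects afterwards
rados -p testpool cleanup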
[ceph-users] Re: PGs stuck in unknown state
Thank you Stefan! My problem was that the ruleset I built had the failure domain set to rack, when I do not have any racks defined. I changed the failure domain to host as this is just a home lab environment. I reverted the ruleset on the pool, and it immediately started to recover and storage was available to my virtual machines again. I then fixed the failure domain and the ruleset works. On 9/20/21 01:37, Stefan Kooman wrote: On 9/20/21 07:51, Mr. Gecko wrote: Hello, I'll start by explaining what I have done. I was adding some new storage in an attempt to set up a cache pool according to https://docs.ceph.com/en/latest/dev/cache-pool/ by doing the following. 1. I upgraded all servers in the cluster to ceph 15.2.14, which put the system into recovery for out-of-sync data. 2. I added 2 SSDs as OSDs to the cluster, which immediately caused ceph to balance onto the SSDs. 3. I added 2 new crush rules which map to SSD storage vs HDD storage. I guess this is where things go wrong. Have you tested the CRUSH rules beforehand? To see if the right OSDs get mapped, or any at all. I would revert the crush rule change for now to try to get your PGs active+clean. If that works, then try to find out (with crushtool for example) why the new CRUSH rule sets do not map the OSDs. Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
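For completeness, a sketch of the crushtool dry-run Stefan refers to, which would have flagged the rack-based rule before it was applied; the rule id and replica count are assumptions.

# Export and decompile the current CRUSH map for inspection
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# Dry-run rule 1 for a 3-replica pool; every line printed means the rule
# could not map enough OSDs for that input (e.g. no racks to choose from)
crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-bad-mappings

# Show the OSD sets the rule would actually pick
crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-mappings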
[ceph-users] Re: Adding cache tier to an existing objectstore cluster possible?
These are the processes in iotop on one node. I think it's compacting, but it is always like this and never finishes.

TID    PRIO  USER  DISK READ  DISK WRITE  SWAPIN   IO>      COMMAND
59936  be/4  ceph  0.00 B/s   10.08 M/s   0.00 %   53.07 %  ceph-osd -f --cluster ceph --id 46 --setuser ceph --setgroup ceph [bstore_kv_sync]
66097  be/4  ceph  0.00 B/s    6.96 M/s   0.00 %   43.11 %  ceph-osd -f --cluster ceph --id 48 --setuser ceph --setgroup ceph [bstore_kv_sync]
63145  be/4  ceph  0.00 B/s    5.82 M/s   0.00 %   40.49 %  ceph-osd -f --cluster ceph --id 47 --setuser ceph --setgroup ceph [bstore_kv_sync]
51150  be/4  ceph  0.00 B/s    3.21 M/s   0.00 %   10.50 %  ceph-osd -f --cluster ceph --id 43 --setuser ceph --setgroup ceph [bstore_kv_sync]
53909  be/4  ceph  0.00 B/s    2.91 M/s   0.00 %    9.98 %  ceph-osd -f --cluster ceph --id 44 --setuser ceph --setgroup ceph [bstore_kv_sync]
57066  be/4  ceph  0.00 B/s    2.18 M/s   0.00 %    8.66 %  ceph-osd -f --cluster ceph --id 45 --setuser ceph --setgroup ceph [bstore_kv_sync]
36672  be/4  ceph  0.00 B/s    2.68 M/s   0.00 %    7.82 %  ceph-osd -f --cluster ceph --id 42 --setuser ceph --setgroup ceph [bstore_kv_sync]

Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From: Stefan Kooman Sent: Monday, September 20, 2021 2:13 PM To: Szabo, Istvan (Agoda) ; ceph-users Subject: Re: [ceph-users] Adding cache tier to an existing objectstore cluster possible? Email received from the internet. If in doubt, don't click any link nor open any attachment ! On 9/20/21 06:15, Szabo, Istvan (Agoda) wrote: > Hi, > > I'm running out of ideas why my wal+db NVMes are always maxed out, so I'm thinking > I might have missed the cache tiering in front of my 4:2 EC pool. Is it > possible to add it later? Maybe I missed a post where you talked about WAL+DB being maxed out. What Ceph version do you use? Maybe you suffer from issue #52244 which is fixed in Pacific 16.2.6 with PR [1]. Gr. Stefan [1]: https://github.com/ceph/ceph/pull/42773 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
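A hedged sketch of how the DB/WAL device load and RocksDB compaction activity can be inspected on one of these OSDs; osd.46 is taken from the iotop output above, the NVMe device name is a placeholder.

# Device-level utilization; a saturated DB/WAL NVMe shows up near 100% util
iostat -x 1            # watch the nvme* rows

# BlueFS and RocksDB perf counters of one OSD (run on the host carrying it)
ceph daemon osd.46 perf dump bluefs
ceph daemon osd.46 perf dump rocksdb

# Trigger a manual RocksDB compaction on that OSD
ceph tell osd.46 compact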
[ceph-users] Re: Adding cache tier to an existing objectstore cluster possible?
On 9/20/21 06:15, Szabo, Istvan (Agoda) wrote: Hi, I'm running out of ideas why my wal+db NVMes are always maxed out, so I'm thinking I might have missed the cache tiering in front of my 4:2 EC pool. Is it possible to add it later? Maybe I missed a post where you talked about WAL+DB being maxed out. What Ceph version do you use? Maybe you suffer from issue #52244 which is fixed in Pacific 16.2.6 with PR [1]. Gr. Stefan [1]: https://github.com/ceph/ceph/pull/42773 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: PGs stuck in unknown state
On 9/20/21 07:51, Mr. Gecko wrote: Hello, I'll start by explaining what I have done. I was adding some new storage in an attempt to set up a cache pool according to https://docs.ceph.com/en/latest/dev/cache-pool/ by doing the following. 1. I upgraded all servers in the cluster to ceph 15.2.14, which put the system into recovery for out-of-sync data. 2. I added 2 SSDs as OSDs to the cluster, which immediately caused ceph to balance onto the SSDs. 3. I added 2 new crush rules which map to SSD storage vs HDD storage. I guess this is where things go wrong. Have you tested the CRUSH rules beforehand? To see if the right OSDs get mapped, or any at all. I would revert the crush rule change for now to try to get your PGs active+clean. If that works, then try to find out (with crushtool for example) why the new CRUSH rule sets do not map the OSDs. Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Adding cache tier to an existing objectstore cluster possible?
Hi, I'm running out of ideas why my wal+db NVMes are always maxed out, so I'm thinking I might have missed the cache tiering in front of my 4:2 EC pool. Is it possible to add it later? There are 9 nodes with 6x 15.3TB SAS SSDs and 3x NVMe drives each. Currently, out of the 3 NVMes, 1 is used for the index pool and meta pool, and the other 2 are used for wal+db in front of 3 SSDs each. I'm thinking of removing the wal+db NVMes and adding them as a writeback cache pool. The only thing that gives me a headache is the description: https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#a-word-of-caution - it feels like using it is not really recommended :/ Any experience with it? Thank you. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
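Mechanically a cache tier can be attached to an existing pool later; the cache-tiering documentation describes it roughly as below. Pool names and thresholds here are assumptions, and whether it is advisable is exactly the concern raised in this thread.

# Attach an existing fast pool as a writeback tier in front of the EC pool
ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay ecpool cachepool

# Minimum settings the tiering agent needs in order to flush/evict at all
ceph osd pool set cachepool hit_set_type bloom
ceph osd pool set cachepool target_max_bytes 1099511627776    # example: 1 TiB
ceph osd pool set cachepool cache_target_dirty_ratio 0.4
ceph osd pool set cachepool cache_target_full_ratio 0.8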
[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?
On 17.09.2021 16:10, Eugen Block wrote: Since I'm trying to test different erasure coding plugins and techniques I don't want the balancer active. So I tried setting it to none as Eugen suggested, and to my surprise I did not get any degraded messages at all, and the cluster was in HEALTH_OK the whole time. Interesting, maybe the balancer works differently now? Or it works differently under heavy load? It would be strange if the balancer's normal operation put the cluster in degraded mode. The only suspicious lines I see are these: Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.402+ 7f66b0329700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f66b0329700' had timed out after 0.0s But I'm not sure if this is related. The out OSDs shouldn't have any impact on this test. Did you monitor the network saturation during these tests with iftop or something similar? I did not, so I reran the test this morning. All the servers have 2x25Gbit/s NICs in bonding with LACP 802.3ad layer3+4. The peak on the active monitor was 27 Mbit/s and less on the other 2 monitors. I also checked the CPU (Xeon 5222 3.8 GHz) and none of the cores was saturated, and network statistics show no errors or drops. So perhaps there is a bug in the balancer code? -- Kai Stian Olstad ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
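For reference, a small sketch of the balancer switches being compared in this test, using the balancer module commands as they exist in recent releases.

# Current balancer state, mode and last optimization
ceph balancer status

# What was tested above: no automatic balancing at all
ceph balancer mode none
ceph balancer off

# Re-enable balancing in upmap mode afterwards
ceph balancer mode upmap
ceph balancer on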
[ceph-users] Re: debugging radosgw sync errors
Ah found it. It was a SSL certificate that was invalid (some PoC which started to mold). Now the sync is running fine, but there is one bucket that got a ton of data in the mdlog. [root@s3db16 ~]# radosgw-admin mdlog list | grep temonitor | wc -l No --period given, using current period=e8fc96f1-ae86-4dc1-b432-470b0772fded 284760 [root@s3db16 ~]# radosgw-admin mdlog list | grep name | wc -l No --period given, using current period=e8fc96f1-ae86-4dc1-b432-470b0772fded 343078 Is it safe to clear the mdlog? Am Mo., 20. Sept. 2021 um 01:00 Uhr schrieb Boris Behrens : > I just deleted the rados object from .rgw.data.root and this removed the > bucket.instance, but this did not solve the problem. > > It looks like there is some access error when I try to radosgw-admin > metadata sync init. > The 403 http response code on the post to the /admin/realm/period endpoint. > > I checked the system_key and added a new system user and set the keys with > zone modify and period update --commit on both sides. > This also did not help. > > After a weekend digging through the mailing list and trying to fix it, I > am totally stuck. > I hope that someone of you people can help me. > > > > > Am Fr., 17. Sept. 2021 um 17:54 Uhr schrieb Boris Behrens : > >> While searching for other things I came across this: >> [root ~]# radosgw-admin metadata list bucket | grep www1 >> "www1", >> [root ~]# radosgw-admin metadata list bucket.instance | grep www1 >> "www1:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.81095307.31103", >> "www1.company.dev", >> [root ~]# radosgw-admin bucket list | grep www1 >> "www1", >> [root ~]# radosgw-admin metadata rm bucket.instance:www1.company.dev >> ERROR: can't remove key: (22) Invalid argument >> >> Maybe this is part of the problem. >> >> Did somebody saw this and know what to do? >> -- >> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im >> groüen Saal. >> > > > -- > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im > groüen Saal. > -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groüen Saal. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
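On the "is it safe to clear the mdlog" question, a cautious sketch of what is usually verified first; the trim options differ between releases, so this deliberately stops short of an actual trim command.

# Both zones should report metadata (and data) sync as caught up
radosgw-admin sync status
radosgw-admin metadata sync status

# Per-shard markers of the metadata log for the current period
radosgw-admin mdlog status

# Trimming is done per shard/period with 'radosgw-admin mdlog trim';
# check 'radosgw-admin help' for the exact marker/shard options of your version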
[ceph-users] Re: rocksdb corruption with 16.2.6
I attached it, but that did not work, so here it is: https://www-f9.ijs.si/~andrej/ceph/ceph-osd.1049.log-20210920.gz Cheers, Andrej On 9/20/21 9:41 AM, Dan van der Ster wrote: On Sun, Sep 19, 2021 at 4:48 PM Andrej Filipcic wrote: I have attached a part of the osd log. Hi Andrej. Did you mean to attach more than the snippets? Could you also send the log of the first startup in 16.2.6 of a now-corrupted osd? Cheers, dan -- _ prof. dr. Andrej Filipcic, E-mail: andrej.filip...@ijs.si Department of Experimental High Energy Physics - F9 Jozef Stefan Institute, Jamova 39, P.o.Box 3000 SI-1001 Ljubljana, Slovenia Tel.: +386-1-477-3674  Fax: +386-1-477-3166 - ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rocksdb corruption with 16.2.6
On Sun, Sep 19, 2021 at 4:48 PM Andrej Filipcic wrote: > I have attached a part of the osd log. Hi Andrej. Did you mean to attach more than the snippets? Could you also send the log of the first startup in 16.2.6 of a now-corrupted osd? Cheers, dan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
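A possible way to capture the kind of startup log being asked for, assuming a non-containerized deployment with centralized config; osd.1049 is taken from the log file name in this thread, and the debug levels are suggestions.

# Raise debug levels for the affected OSD only
ceph config set osd.1049 debug_bluestore 20
ceph config set osd.1049 debug_bluefs 20
ceph config set osd.1049 debug_rocksdb 10

# Restart the daemon so the startup path is logged, then collect the log,
# e.g. /var/log/ceph/ceph-osd.1049.log on a package-based install
systemctl restart ceph-osd@1049

# Drop the debug levels again once the log has been collected
ceph config rm osd.1049 debug_bluestore
ceph config rm osd.1049 debug_bluefs
ceph config rm osd.1049 debug_rocksdb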
[ceph-users] Re: Adding cache tier to an existing objectstore cluster possible?
My experience was that placing DB+WAL on NVME provided a much better and much more consistent boost to a HDD-backed pool than a cache tier. My biggest grief with the cache tier was its unpredictable write performance, when it would cache some writes and then immediately not cache some others seemingly at random, and we couldn't affect this behavior with any settings, both well and not so well documented. Read cache performance was somewhat more predictable, but not nearly at the level our enterprise NVME drives could provide. Then I asked about this on IRC and the feedback I got was basically "it is what it is, avoid using cache tier". Zakhar On Mon, Sep 20, 2021 at 9:56 AM Eugen Block wrote: > And we are quite happy with our cache tier. When we got new HDD OSDs > we tested if things would improve without the tier but we had to stick > to it, otherwise working with our VMs was almost impossible. But this > is an RBD cache so I can't tell how the other protocols perform with a > cache tier. > > > Zitat von Zakhar Kirpichenko : > > > Hi, > > > > You can arbitrarily add or remove the cache tier, there's no problem with > > that. The problem is that cache tier doesn't work well, I tried it in > front > > of replicated and EC-pools with very mixed results: when it worked there > > wasn't as much of a speed/latency benefit as one would expect from > > NVME-based cache, and most of the time it just didn't work with I/O very > > obviously hitting the underlying "cold data" pool for no reason. This > > behavior is likely why cache tier isn't recommended. I eventually > > dismantled the cache tier and used NVME for WAL+DB. > > > > Best regards, > > Zakhar > > > > On Mon, Sep 20, 2021 at 7:16 AM Szabo, Istvan (Agoda) < > > istvan.sz...@agoda.com> wrote: > > > >> Hi, > >> > >> I'm running out of idea why my wal+db nvmes are maxed out always so > >> thinking of I might missed the cache tiering in front of my 4:2 > ec-pool. IS > >> it possible to add it later? > >> There are 9 nodes with 6x 15.3TB SAS ssds, 3x nvme drives. Currently out > >> of the 3 nvme 1 is used for index pool and meta pool, the other 2 is > used > >> for wal+db in front of 3-3 ssds. Thinking to remove the wal+db nvmes and > >> add it as a write back cache pool. > >> > >> The only thing which makes head ache is the description: > >> > https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#a-word-of-caution > >> feels like not really suggested to use it :/ > >> > >> Any experience with it? > >> > >> Thank you. > >> > >> ___ > >> ceph-users mailing list -- ceph-users@ceph.io > >> To unsubscribe send an email to ceph-users-le...@ceph.io > >> > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
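For the DB+WAL-on-NVMe layout that worked better than the cache tier here, a minimal ceph-volume sketch; device names are placeholders, and when only a DB device is given the WAL is placed on it as well.

# One OSD with its RocksDB DB (and therefore WAL) on an NVMe partition or LV
ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1

# Or batch mode: several data devices sharing one NVMe for their DB volumes
ceph-volume lvm batch /dev/sdb /dev/sdc /dev/sdd --db-devices /dev/nvme0n1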