[ceph-users] precise/best way to check ssd usage
Currently I am checking usage on SSD drives with ceph osd df | egrep 'CLASS|ssd'. I have a usage between 48% and 57%, and assume that with a node failure (only using 3x repl.) 1/3 of this 57% needs to be able to migrate and be added to a different node. Is there a better way of checking this (on old Nautilus)?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
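One way to go beyond eyeballing ceph osd df is to compute what usage would look like after a node failure. The sketch below is a back-of-the-envelope model, not a Ceph API: it assumes the failed node's data is rebalanced evenly across the surviving nodes (CRUSH does not guarantee this), and the node names and percentages are made-up examples.

```python
# Back-of-the-envelope headroom check; node names and percentages are
# illustrative. Assumes a failed node's data spreads evenly over the
# survivors, which is optimistic -- CRUSH placement can skew the result.
def post_failure_usage(node_usage_pct, failed_node):
    """Estimate per-node usage % after `failed_node` is rebalanced away."""
    survivors = {n: u for n, u in node_usage_pct.items() if n != failed_node}
    share = node_usage_pct[failed_node] / len(survivors)
    return {n: round(u + share, 1) for n, u in survivors.items()}

usage = {"node1": 48.0, "node2": 52.0, "node3": 55.0, "node4": 57.0}
# Compare the estimate against the default nearfull ratio (85%) to see
# whether losing the fullest node leaves enough headroom:
print(post_failure_usage(usage, "node4"))  # {'node1': 67.0, 'node2': 71.0, 'node3': 74.0}
```

The same arithmetic per device class (ssd vs hdd) would answer the original question more directly than a raw usage listing.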
[ceph-users] Re: Not all Bucket Shards being used
Thank you for the information, Christian.

When you reshard, the bucket id is updated (with most recent versions of ceph, a generation number is incremented). The first bucket id matches the bucket marker, but after the first reshard they diverge. The bucket id is in the names of the currently used bucket index shards. You're searching for the marker, which means you're finding older bucket index shards. Change your commands to these:

# rados -p raum.rgw.buckets.index ls \
  | grep 3caabb9a-4e3b-4b8a-8222-34c33dd63210.10648356.1 \
  | sort -V

# rados -p raum.rgw.buckets.index ls \
  | grep 3caabb9a-4e3b-4b8a-8222-34c33dd63210.10648356.1 \
  | sort -V \
  | xargs -IOMAP sh -c \
    'rados -p raum.rgw.buckets.index listomapkeys OMAP | wc -l'

When you refer to the "second zone", what do you mean? Is this cluster using multisite? If and only if your answer is "no", then it's safe to remove old bucket index shards. Depending on the version of ceph running when the reshard was run, they were either intentionally left behind (earlier behavior) or removed automatically (later behavior).

Eric
(he/him)

> On Jul 25, 2023, at 6:32 AM, Christian Kugler wrote:
>
> Hi Eric,
>
>> 1. I recommend that you *not* issue another bucket reshard until you figure
>> out what's going on.
>
> Thanks, noted!
>
>> 2. Which version of Ceph are you using?
>
> 17.2.5
> I wanted to get the cluster to HEALTH_OK before upgrading. I didn't
> see anything that led me to believe that an upgrade could fix the
> reshard issue.
>
>> 3. Can you issue a `radosgw-admin metadata get bucket:` so we
>> can verify what the current marker is?
>
> # radosgw-admin metadata get bucket:sql20
> {
>     "key": "bucket:sql20",
>     "ver": {
>         "tag": "_hGhtgzjcWY9rO9JP7YlWzt8",
>         "ver": 3
>     },
>     "mtime": "2023-07-12T15:56:55.226784Z",
>     "data": {
>         "bucket": {
>             "name": "sql20",
>             "marker": "3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9",
>             "bucket_id": "3caabb9a-4e3b-4b8a-8222-34c33dd63210.10648356.1",
>             "tenant": "",
>             "explicit_placement": {
>                 "data_pool": "",
>                 "data_extra_pool": "",
>                 "index_pool": ""
>             }
>         },
>         "owner": "S3user",
>         "creation_time": "2023-04-26T09:22:01.681646Z",
>         "linked": "true",
>         "has_bucket_info": "false"
>     }
> }
>
>> 4. After you resharded previously, did you get command-line output along the
>> lines of:
>> 2023-07-24T13:33:50.867-0400 7f10359f2a80 1 execute INFO: reshard of bucket
>> "" completed successfully
>
> I think so, at least for the second reshard. But I wouldn't bet my
> life on it. I fear I might have missed an error on the first one since
> I have done a radosgw-admin bucket reshard so often and never seen it
> fail.
>
> Christian
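To make the old-vs-new shard generations obvious at a glance, the `rados ls` output of the index pool can be grouped by bucket id. The helper below is a hypothetical sketch: the ".dir.<bucket_id>.<shard_number>" naming is the standard RGW index-object layout, but the sample listing is shortened and illustrative, not real output.

```python
# Group RGW bucket index objects (as listed by `rados ls` on the index pool)
# by bucket-id generation, so leftover marker-era shards stand out next to
# the current bucket_id generation.
from collections import Counter

def shards_by_generation(object_names):
    gens = Counter()
    for name in object_names:
        if name.startswith(".dir."):
            # strip the ".dir." prefix and the trailing shard number
            gens[name[len(".dir."):].rsplit(".", 1)[0]] += 1
    return gens

listing = [
    ".dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.0",  # old marker era
    ".dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10648356.1.0",  # current id
    ".dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10648356.1.1",
]
print(shards_by_generation(listing))
```

A generation whose id matches the marker rather than the current bucket_id is a candidate for the old shards discussed above.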
[ceph-users] Re: LARGE_OMAP_OBJECTS warning and bucket has lot of unknown objects and 1999 shards.
There are a couple of potential explanations.

1) Do you have versioning turned on?
1a) And do you write the same file over and over, such as a heartbeat file?
2) Do you have lots of incomplete multipart uploads?

If you wouldn't mind, please run `radosgw-admin bi list --bucket=epbucket --max-entries=50` and provide the output in a reply.

Thanks,
Eric
(he/him)

> On Jul 28, 2023, at 9:25 AM, Uday Bhaskar Jalagam wrote:
>
> Hello everyone,
> I am getting a [WRN] LARGE_OMAP_OBJECTS: 18 large omap objects warning
> in one of my clusters. I see one of the buckets has a huge number of
> shards (1999) and "num_objects": 221185360 when I check bucket stats using
> radosgw-admin bucket stats. However, I see only 8 files when I actually
> list the bucket using python boto3. I am not sure what those objects are;
> am I missing something?
>
> # radosgw-admin bucket stats --bucket=epbucket
> {
>     "bucket": "epbucket",
>     "num_shards": 1999,
>     "tenant": "",
>     "zonegroup": "ac361c18-7420-4739-8e92-73109fc8ec80",
>     "placement_rule": "default-placement",
>     "explicit_placement": {
>         "data_pool": "",
>         "data_extra_pool": "",
>         "index_pool": ""
>     },
>     "id": "f3e47d62-b9ff-4f38-b863-d7bfc155d698.35744173.268063",
>     "marker": "f3e47d62-b9ff-4f38-b863-d7bfc155d698.35744173.1225",
>     "index_type": "Normal",
>     "owner": "user",
>     "ver":
"0#3,1#1,2#1,3#9,4#1,5#3,6#1,7#3,8#1,9#8,10#7,11#1,12#3,13#1,14#1,15#1,16#1,17#1,18#5,19#1,20#1,21#3,22#3,23#1,24#3,25#1,26#1,27#1,28#3,29#3,30#2,31#3,32#2,33#1,34#3,35#1,36#3,37#1,38#3,39#1,40#3,41#1,42#1,43#1,44#1,45#1,46#3,47#1,48#3,49#1,50#3,51#1,52#1,53#1,54#1,55#1,56#3,57#1,58#1,59#5,60#1,61#1,62#1,63#3,64#1,65#1,66#3,67#1,68#5,69#1,70#1,71#3,72#1,73#3,74#2,75#3,76#1,77#3,78#1,79#1,80#3,81#3,82#3,83#3,84#1,85#1,86#3,87#3,88#8,89#1,90#1,91#1,92#5,93#7,94#1,95#1,96#3,97#1,98#5,99#1,100#1,101#3,102#5,103#3,104#1,105#1,106#1,107#1,108#1,109#239050,110#1,111#1,112#1,113#5,114#1,115#1,116#1,117#3,118#3,119#9,120#1,121#1,122#3,123#5,124#3,125#3,126#1,127#1,128#1,129#3,130#3,131#3,132#3,133#5,134#1,135#1,136#1,137#1,138#5,139#3,140#1,141#1,142#3,143#3,144#1,145#1,146#3,147#3,148#3,149#1,150#3,151#1,152#1,153#3,154#7,155#1,156#5,157#3,158#3,159#3,160#5,161#3,162#3,163#3,164#3,165#1,166#1,167#5,168#3,169#1,170#1,171#1,172#4,173#1,174#12,175#1,176#3,177#3,178#3,179#1,180#3,181#33,182#1,18 > 3#1,184#1,185#1,186#3,187#3,188#11,189#3,190#5,191#3,192#1,193#5,194#3,195#2,196#3,197#1,198#3,199#1,200#7,201#3,202#7,203#1,204#3,205#3,206#1,207#1,208#3,209#1,210#7,211#3,212#2,213#9,214#1,215#3,216#1,217#1,218#7,219#1,220#1,221#5,222#1,223#1,224#3,225#1,226#1,227#3,228#3,229#1,230#5,231#5,232#1,233#3,234#1,235#1,236#5,237#1,238#1,239#3,240#7,241#3,242#1,243#3,244#1,245#1,246#3,247#1,248#3,249#3,250#1,251#1,252#3,253#1,254#3,255#1,256#3,257#1,258#1,259#7,260#9,261#3,262#1,263#1,264#3,265#1,266#1,267#5,268#1,269#1,270#1,271#1,272#1,273#3,274#1,275#1,276#1,277#3,278#1,279#7,280#1,281#238803,282#3,283#1,284#1,285#1,286#1,287#3,288#1,289#1,290#1,291#16,292#3,293#3,294#1,295#1,296#5,297#3,298#3,299#7,300#5,301#5,302#1,303#1,304#1,305#3,306#3,307#5,308#1,309#1,310#3,311#3,312#1,313#1,314#5,315#3,316#7,317#1,318#1,319#3,320#3,321#3,322#1,323#9,324#3,325#1,326#5,327#1,328#3,329#9,330#3,331#1,332#3,333#3,334#3,335#3,336#3,337#9,338#1,339#1,340#1,341#3,342#3,343#3,344#2,345#3,346#3,347#1,34 
> 8#3,349#3,350#5,351#1,352#5,353#8,354#5,355#1,356#3,357#1,358#3,359#1,360#1,361#1,362#1,363#5,364#1,365#1,366#1,367#1,368#1,369#1,370#3,371#3,372#3,373#3,374#1,375#3,376#5,377#8,378#1,379#1,380#1,381#1,382#3,383#3,384#1,385#3,386#1,387#1,388#1,389#2,390#1,391#1,392#1,393#1,394#5,395#5,396#1,397#5,398#3,399#1,400#1,401#7,402#1,403#5,404#1,405#1,406#3,407#3,408#3,409#1,410#1,411#1,412#1,413#1,414#1,415#1,416#1,417#1,418#3,419#3,420#1,421#5,422#5,423#5,424#1,425#1,426#1,427#1,428#9,429#1,430#3,431#1,432#3,433#3,434#1,435#1,436#3,437#5,438#3,439#3,440#1,441#1,442#1,443#1,444#1,445#5,446#3,447#9,448#1,449#1,450#1,451#1,452#1,453#3,454#1,455#1,456#3,457#5,458#5,459#3,460#9,461#1,462#1,463#1,464#1,465#1,466#1,467#3,468#1,469#3,470#8,471#1,472#1,473#1,474#7,475#3,476#1,477#11,478#10,479#1,480#1,481#1,482#3,483#1,484#7,485#1,486#1,487#3,488#8,489#1,490#1,491#8,492#3,493#1,494#3,495#5,496#1,497#1,498#1,499#3,500#1,501#5,502#1,503#1,504#3,505#1,506#1,507#1,508#3,509#3,510#3,511#1,512#7,513#3,5 >
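When the `bi list` output comes back, tallying the entry types is a quick way to see whether versioning ("olh"/"instance" entries) accounts for the gap between 221 million index entries and 8 visible objects. A sketch, assuming the usual bi-list JSON layout with a `type` field per entry; the sample document is fabricated for illustration:

```python
# Tally entry types from `radosgw-admin bi list` JSON output. Versioned
# buckets show "olh" and "instance" entries alongside (or instead of)
# "plain" ones; a heartbeat file rewritten constantly can pile these up.
import json
from collections import Counter

def tally_entry_types(bi_list_json):
    return Counter(e.get("type", "unknown") for e in json.loads(bi_list_json))

sample = json.dumps([
    {"type": "plain",    "idx": "heartbeat.txt"},
    {"type": "olh",      "idx": "heartbeat.txt"},
    {"type": "instance", "idx": "heartbeat.txt"},
    {"type": "instance", "idx": "heartbeat.txt"},
])
print(dict(tally_entry_types(sample)))
```

A ratio heavily skewed toward non-"plain" entries would point at explanation 1/1a above.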
[ceph-users] Re: cephadm logs
Not currently. Those logs aren't generated by any daemons; they come directly from anything done by the cephadm binary on the host, which tends to be quite a bit, since the cephadm mgr module runs most of its operations on the host through a copy of the cephadm binary. It doesn't log to the journal because it doesn't have a systemd unit or anything; it's just a python script being run directly, and nothing has been implemented to make it possible for it to log to journald.

On Fri, Jul 28, 2023 at 9:43 AM Luis Domingues wrote:

> Hi,
>
> Quick question about cephadm and its logs. On my cluster all logs go to
> journald. But on each machine, I still have /var/log/ceph/cephadm.log
> being written.
>
> Is there a way to make cephadm log to journald instead of a file? If yes,
> did I miss it in the documentation? Or if not, is there any reason to log
> to a file while everything else logs to journald?
>
> Thanks
>
> Luis Domingues
> Proton AG
[ceph-users] Reef release candidate - v18.1.3
This is the third and possibly last release candidate for Reef.

The Reef release comes with a new RocksDB version (7.9.2) [0], which incorporates several performance improvements and features. Our internal testing doesn't show any side effects from the new version, but we are very eager to hear community feedback on it.

This is the first release with the ability to tune RocksDB settings per column family [1], which allows more granular tunings to be applied to the different kinds of data stored in RocksDB. A new set of settings is used in Reef to optimize performance for most kinds of workloads, with a slight penalty in some cases that is outweighed by large improvements in use cases such as RGW, in terms of compactions and write amplification. We would highly encourage community members to give these a try against their performance benchmarks and use cases. The detailed list of changes in terms of RocksDB and BlueStore can be found in https://pad.ceph.com/p/reef-rc-relnotes.

If any of our community members would like to help us with performance investigations or regression testing of the Reef release candidate, please feel free to provide feedback via email or in https://pad.ceph.com/p/reef_scale_testing. For more active discussions, please use the #ceph-at-scale slack channel in ceph-storage.slack.com.

This RC has gone through only partial testing due to issues we are experiencing in the sepia lab. Please try it out and report any issues you encounter. Happy testing!

Thanks,
YuriW

Get the release from
* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-18.1.3.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: f594a0802c34733bb06e5993bc4bdb085c9a5f3f
[ceph-users] cephadm logs
Hi,

Quick question about cephadm and its logs. On my cluster all logs go to journald. But on each machine, I still have /var/log/ceph/cephadm.log being written.

Is there a way to make cephadm log to journald instead of a file? If yes, did I miss it in the documentation? Or if not, is there any reason to log to a file while everything else logs to journald?

Thanks

Luis Domingues
Proton AG
[ceph-users] LARGE_OMAP_OBJECTS warning and bucket has lot of unknown objects and 1999 shards.
Hello Everyone , I am getting [WRN] LARGE_OMAP_OBJECTS: 18 large omap objects warning in one of my clusters . I see one of the buckets has a huge number of shards 1999 and "num_objects": 221185360 when I check bucket stats using radosgw-admin bucket stats . However I see only 8 files when I actually do list bucket using python boto3. I am not sure what those objects are missing somewhere . # radosgw-admin bucket stats --bucket=epbucket { "bucket": "epbucket", "num_shards": 1999, "tenant": "", "zonegroup": "ac361c18-7420-4739-8e92-73109fc8ec80", "placement_rule": "default-placement", "explicit_placement": { "data_pool": "", "data_extra_pool": "", "index_pool": "" }, "id": "f3e47d62-b9ff-4f38-b863-d7bfc155d698.35744173.268063", "marker": "f3e47d62-b9ff-4f38-b863-d7bfc155d698.35744173.1225", "index_type": "Normal", "owner": "user", "ver": "0#3,1#1,2#1,3#9,4#1,5#3,6#1,7#3,8#1,9#8,10#7,11#1,12#3,13#1,14#1,15#1,16#1,17#1,18#5,19#1,20#1,21#3,22#3,23#1,24#3,25#1,26#1,27#1,28#3,29#3,30#2,31#3,32#2,33#1,34#3,35#1,36#3,37#1,38#3,39#1,40#3,41#1,42#1,43#1,44#1,45#1,46#3,47#1,48#3,49#1,50#3,51#1,52#1,53#1,54#1,55#1,56#3,57#1,58#1,59#5,60#1,61#1,62#1,63#3,64#1,65#1,66#3,67#1,68#5,69#1,70#1,71#3,72#1,73#3,74#2,75#3,76#1,77#3,78#1,79#1,80#3,81#3,82#3,83#3,84#1,85#1,86#3,87#3,88#8,89#1,90#1,91#1,92#5,93#7,94#1,95#1,96#3,97#1,98#5,99#1,100#1,101#3,102#5,103#3,104#1,105#1,106#1,107#1,108#1,109#239050,110#1,111#1,112#1,113#5,114#1,115#1,116#1,117#3,118#3,119#9,120#1,121#1,122#3,123#5,124#3,125#3,126#1,127#1,128#1,129#3,130#3,131#3,132#3,133#5,134#1,135#1,136#1,137#1,138#5,139#3,140#1,141#1,142#3,143#3,144#1,145#1,146#3,147#3,148#3,149#1,150#3,151#1,152#1,153#3,154#7,155#1,156#5,157#3,158#3,159#3,160#5,161#3,162#3,163#3,164#3,165#1,166#1,167#5,168#3,169#1,170#1,171#1,172#4,173#1,174#12,175#1,176#3,177#3,178#3,179#1,180#3,181#33,182#1,18 
3#1,184#1,185#1,186#3,187#3,188#11,189#3,190#5,191#3,192#1,193#5,194#3,195#2,196#3,197#1,198#3,199#1,200#7,201#3,202#7,203#1,204#3,205#3,206#1,207#1,208#3,209#1,210#7,211#3,212#2,213#9,214#1,215#3,216#1,217#1,218#7,219#1,220#1,221#5,222#1,223#1,224#3,225#1,226#1,227#3,228#3,229#1,230#5,231#5,232#1,233#3,234#1,235#1,236#5,237#1,238#1,239#3,240#7,241#3,242#1,243#3,244#1,245#1,246#3,247#1,248#3,249#3,250#1,251#1,252#3,253#1,254#3,255#1,256#3,257#1,258#1,259#7,260#9,261#3,262#1,263#1,264#3,265#1,266#1,267#5,268#1,269#1,270#1,271#1,272#1,273#3,274#1,275#1,276#1,277#3,278#1,279#7,280#1,281#238803,282#3,283#1,284#1,285#1,286#1,287#3,288#1,289#1,290#1,291#16,292#3,293#3,294#1,295#1,296#5,297#3,298#3,299#7,300#5,301#5,302#1,303#1,304#1,305#3,306#3,307#5,308#1,309#1,310#3,311#3,312#1,313#1,314#5,315#3,316#7,317#1,318#1,319#3,320#3,321#3,322#1,323#9,324#3,325#1,326#5,327#1,328#3,329#9,330#3,331#1,332#3,333#3,334#3,335#3,336#3,337#9,338#1,339#1,340#1,341#3,342#3,343#3,344#2,345#3,346#3,347#1,34 8#3,349#3,350#5,351#1,352#5,353#8,354#5,355#1,356#3,357#1,358#3,359#1,360#1,361#1,362#1,363#5,364#1,365#1,366#1,367#1,368#1,369#1,370#3,371#3,372#3,373#3,374#1,375#3,376#5,377#8,378#1,379#1,380#1,381#1,382#3,383#3,384#1,385#3,386#1,387#1,388#1,389#2,390#1,391#1,392#1,393#1,394#5,395#5,396#1,397#5,398#3,399#1,400#1,401#7,402#1,403#5,404#1,405#1,406#3,407#3,408#3,409#1,410#1,411#1,412#1,413#1,414#1,415#1,416#1,417#1,418#3,419#3,420#1,421#5,422#5,423#5,424#1,425#1,426#1,427#1,428#9,429#1,430#3,431#1,432#3,433#3,434#1,435#1,436#3,437#5,438#3,439#3,440#1,441#1,442#1,443#1,444#1,445#5,446#3,447#9,448#1,449#1,450#1,451#1,452#1,453#3,454#1,455#1,456#3,457#5,458#5,459#3,460#9,461#1,462#1,463#1,464#1,465#1,466#1,467#3,468#1,469#3,470#8,471#1,472#1,473#1,474#7,475#3,476#1,477#11,478#10,479#1,480#1,481#1,482#3,483#1,484#7,485#1,486#1,487#3,488#8,489#1,490#1,491#8,492#3,493#1,494#3,495#5,496#1,497#1,498#1,499#3,500#1,501#5,502#1,503#1,504#3,505#1,506#1,507#1,508#3,509#3,510#3,511#1,512#7,513#3,5 
14#1,515#3,516#5,517#1,518#3,519#1,520#3,521#3,522#3,523#5,524#3,525#3,526#1,527#1,528#1,529#1,530#1,531#3,532#1,533#3,534#5,535#5,536#1,537#3,538#3,539#5,540#1,541#1,542#1,543#1,544#7,545#3,546#1,547#1,548#1,549#3,550#1,551#1,552#3,553#1,554#3,555#3,556#1,557#3,558#5,559#3,560#3,561#8,562#1,563#3,564#7,565#3,566#5,567#1,568#3,569#3,570#3,571#3,572#5,573#1,574#1,575#5,576#5,577#3,578#7,579#1,580#1,581#5,582#1,583#3,584#1,585#1,586#9,587#9,588#3,589#1,590#1,591#3,592#3,593#1,594#10,595#3,596#1,597#3,598#1,599#1,600#1,601#3,602#3,603#1,604#3,605#5,606#1,607#1,608#1,609#10,610#3,611#264711,612#3,613#5,614#3,615#3,616#1,617#1,618#7,619#3,620#3,621#1,622#3,623#1,624#3,625#1,626#3,627#1,628#1,629#3,630#1,631#2,632#233833,633#1,634#1,635#5,636#1,637#1,638#1,639#1,640#3,641#1,642#1,643#3,644#1,645#1,646#10,647#1,648#1,649#1,650#3,651#1,652#1,653#5,654#1,655#7,656#7,657#1,658#1,659#5,660#3,661#3,662#1,663#3,664#1,665#1,666#3,667#1,668#7,669#1,670#3,671#3,672#1,673#1,674#1,675#1,676#1,677#3,6
[ceph-users] Re: MON sync time depends on outage duration
Hi,

I think we found an explanation for the behaviour; we still need to verify it, though. Just wanted to write it up for posterity.

We already knew that the large number of "purged_snap" keys in the mon store is responsible for the long synchronization. Removing them didn't seem to have a negative impact in my test cluster, but I don't want to try that in production. They also tried a couple of variations with mon_sync_payload_size, but it didn't have a significant impact (it affected a few other keys, but not the osd_snap keys). We seemed to hit the mon_sync_payload_keys limit (default 2000); we'll suggest increasing it and hopefully find a suitable value.

But it still didn't explain the variations in the sync duration. So we looked deeper (also dived into the code) and finally got some debug logs we could analyse. The paxos versions determine whether a "full sync" is required or a "recent sync" is sufficient:

  if (paxos->get_version() < m->paxos_first_version &&
      m->paxos_first_version > 1) {
    dout(10) << " peer paxos first versions [" << m->paxos_first_version
             << "," << m->paxos_last_version << "]"
             << " vs my version " << paxos->get_version()
             << " (too far ahead)" << dendl;
    ...

So if the current version of the to-be-synced mon is lower than the first available version of the peer, it starts a full sync; otherwise a recent sync is started. In one of the tests (simulating a mon reboot) the difference between paxos versions was 628. I checked the available mon config options and found "paxos_min" (default 500). This will be the next suggestion: increase paxos_min to 1000 so the cluster doesn't require a full sync after a regular reboot and only does a full sync in case a mon is down for a longer period of time. Not sure what other impact it could have except for some more storage consumption, but we'll let them test it.

But this still doesn't explain the variations in the startup times.
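The quoted condition can be modelled in a few lines. This is a simplified illustration, not Ceph code, and the version numbers are made up:

```python
# Simplified model of the quoted Monitor check: a mon that has fallen behind
# the peer's first retained paxos version must do a full sync. paxos_min
# controls (roughly) how many versions the peer retains, so raising it widens
# the window in which a cheap "recent" sync is still possible.
def sync_kind(my_version, peer_first_version):
    if my_version < peer_first_version and peer_first_version > 1:
        return "full"    # peer already trimmed the versions we'd need
    return "recent"      # incremental catch-up is enough

# A mon 628 versions behind a peer at 10628 that keeps only the last 500
# versions (paxos_min=500):
print(sync_kind(my_version=10_000, peer_first_version=10_128))  # full
# Same gap, but the peer retains the last 1000 versions (paxos_min=1000):
print(sync_kind(my_version=10_000, peer_first_version=9_628))   # recent
```

This is why a gap of 628 versions forces a full sync at the default paxos_min but would not at 1000.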
My current theory is that the duration depends on the timing of the reboot/daemon shutdown: rbd-mirror is currently configured with a 30-minute schedule. This means that every full and every half hour new snapshots are created and synced, and older snapshots are deleted, which impacts the osdmap. So if a MON goes down during this time, it's very likely that its paxos version will be lower than the first available on the peer(s). So if a reboot is scheduled right after the snapshot schedule, the mon synchronisation time will probably decrease. This also needs some verification; we're still waiting for the results.

From my perspective, those two config options (mon_sync_payload_keys, paxos_min) and rebooting a MON server at the right time are the most promising approaches for now. Having the mon store on SSDs would help as well, of course, but unfortunately that's currently not an option.

I'll update this thread when we have more results; maybe my theory is garbage, but I'm confident. :-) If you have comments or objections regarding those config options, I'd appreciate them.

Thanks,
Eugen

Zitat von Josh Baergen:

Out of curiosity, what is your require_osd_release set to? (ceph osd dump | grep require_osd_release)

Josh

On Tue, Jul 11, 2023 at 5:11 AM Eugen Block wrote:

I'm not so sure anymore if that could really help here. The dump-keys output from the mon contains 42 million osd_snap prefix entries, 39 million of them "purged_snap" keys. I also compared to other clusters; those aren't tombstones but the expected "history" of purged snapshots. So I don't think removing a couple of hundred trash snapshots will actually reduce the number of osd_snap keys. At least doubling the payload_size seems to have a positive impact. The compaction during the sync has a negative impact, of course, same as not having the mon store on SSDs.
I'm currently playing with a test cluster, removing all "purged_snap" entries from the mon db (not finished yet) to see what that will do with the mon and if it will even start correctly. But has anyone done that, removing keys from the mon store? Not sure what to expect yet... Zitat von Dan van der Ster : > Oh yes, sounds like purging the rbd trash will be the real fix here! > Good luck! > > __ > Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com > > > > > On Mon, Jul 10, 2023 at 6:10 AM Eugen Block wrote: > >> Hi, >> I got a customer response with payload size 4096, that made things >> even worse. The mon startup time was now around 40 minutes. My doubts >> wrt decreasing the payload size seem confirmed. Then I read Dan's >> response again which also mentions that the default payload size could >> be too small. So I asked them to double the default (2M instead of 1M) >> and am now waiting for a new result. I'm still wondering why this only >> happens when the mon is down for more than 5 minutes. Does
[ceph-users] Re: MDS stuck in rejoin
On 7/26/23 22:13, Frank Schilder wrote:
> Hi Xiubo.
> ...
>> I am more interested in the kclient side logs. Just want to know why
>> that oldest request got stuck so long.
>
> I'm afraid I'm a bad admin in this case. I don't have logs from the host
> any more; I would have needed the output of dmesg and this is gone. In
> case it happens again I will try to pull the info out.
>
> The tracker https://tracker.ceph.com/issues/22885 sounds a lot more
> violent than our situation. We had no problems with the MDSes, the cache
> didn't grow, and the relevant one was also not put into read-only mode.
> It was just this warning showing all the time; health was OK otherwise.
> I think the warning was there for at least 16h before I failed the MDS.
> The MDS log contains nothing; this is the only line mentioning this client:
>
> 2023-07-20T00:22:05.518+0200 7fe13df59700 0 log_channel(cluster) log
> [WRN] : client.145678382 does not advance its oldest_client_tid
> (16121616), 10 completed requests recorded in session

Okay, if so it's hard to say and dig out what happened in the client and why it didn't advance the tid.

Thanks
- Xiubo

> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14