[ceph-users] Re: reef 18.2.3 QE validation status

2024-07-11 Thread Venky Shankar
Hi Yuri, On Wed, Jul 10, 2024 at 7:28 PM Yuri Weinstein wrote: > > We built a new branch with all the cherry-picks on top > (https://pad.ceph.com/p/release-cherry-pick-coordination). > > I am rerunning fs:upgrade: >

[ceph-users] Re: CephFS constant high write I/O to the metadata pool

2024-07-03 Thread Venky Shankar
er the drop there is steady rise but now > >> these sudden jumps are something new and even more scary :E > >> > >> Here's a fresh 2sec level 20 mds log: > >> https://gist.github.com/olliRJL/074bec65787085e70db8af0ec35f8148 > >> > >> Any help and i

[ceph-users] Re: reef 18.2.3 QE validation status

2024-07-03 Thread Venky Shankar
Hi Yuri, On Tue, Jul 2, 2024 at 7:36 PM Yuri Weinstein wrote: > After fixing the issues identified below we cherry-picked all PRs from > this list for 18.2.3 > https://pad.ceph.com/p/release-cherry-pick-coordination. > > The question to the dev leads: do you think we can proceed with the >

[ceph-users] Re: [EXTERN] Urgent help with degraded filesystem needed

2024-07-03 Thread Venky Shankar
Hi Stefan, On Tue, Jul 2, 2024 at 4:16 PM Stefan Kooman wrote: > > Hi Venky, > > On 02-07-2024 09:45, Venky Shankar wrote: > > Hi Stefan, > > > > On Mon, Jul 1, 2024 at 2:30 PM Stefan Kooman wrote: > >> > >> Hi Dietmar, > >> >

[ceph-users] Re: squid 19.1.0 RC QE validation status

2024-07-03 Thread Venky Shankar
Hi Yuri, On Mon, Jul 1, 2024 at 7:53 PM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/66756#note-1 > > Release Notes - TBD > LRC upgrade - TBD > > (Reruns were not done yet.) > > Seeking approvals/reviews for: > > smoke > rados -

[ceph-users] Re: squid 19.1.0 RC QE validation status

2024-07-03 Thread Venky Shankar
grade/reef-x suite, > which had this RBD failure: > >- https://tracker.ceph.com/issues/63131 - TestMigration.Stress2: >snap3, block 171966464~4194304 differs after migration - RBD > > > @Venky Shankar , please see the powercycle suite, > which had this CephFS failure: >

[ceph-users] Re: [EXTERN] Urgent help with degraded filesystem needed

2024-07-02 Thread Venky Shankar
Hi Stefan, On Mon, Jul 1, 2024 at 2:30 PM Stefan Kooman wrote: > > Hi Dietmar, > > On 29-06-2024 10:50, Dietmar Rieder wrote: > > Hi all, > > > > finally we were able to repair the filesystem and it seems that we did > > not lose any data. Thanks for all suggestions and comments. > > > > Here is

[ceph-users] Re: reef 18.2.3 QE validation status

2024-04-17 Thread Venky Shankar
On Sat, Apr 13, 2024 at 12:08 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/65393#note-1 > Release Notes - TBD > LRC upgrade - TBD > > Seeking approvals/reviews for: > > smoke - infra issues, still trying, Laura PTL > > rados -

[ceph-users] Re: reef 18.2.3 QE validation status

2024-04-17 Thread Venky Shankar
Hi Yuri, On Tue, Apr 16, 2024 at 7:52 PM Yuri Weinstein wrote: > > And approval is needed for: > > fs - Venky approved? fs approved. failures are: https://tracker.ceph.com/projects/cephfs/wiki/Reef#2024-04-17 > powercycle - seems fs related, Venky, Brad PTL > > On Mon, Apr 15, 2024 at 5:55 PM

[ceph-users] Re: reef 18.2.3 QE validation status

2024-04-16 Thread Venky Shankar
On Tue, Apr 16, 2024 at 7:52 PM Yuri Weinstein wrote: > > And approval is needed for: > > fs - Venky approved? Could not get to this today. Will be done tomorrow. > powercycle - seems fs related, Venky, Brad PTL > > On Mon, Apr 15, 2024 at 5:55 PM Yuri Weinstein wrote: > > > > Still waiting

[ceph-users] PSA: CephFS/MDS config defer_client_eviction_on_laggy_osds

2024-03-15 Thread Venky Shankar
If you are using CephFS on Pacific v16.2.14(+), the MDS config `defer_client_eviction_on_laggy_osds' is enabled by default. This config is used to not evict cephfs clients if OSDs are laggy[1]. However, this can result in a single client holding up the MDS in servicing other clients. To avoid
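For reference, a minimal way to turn this behaviour off cluster-wide (a sketch, assuming you prefer evicting laggy clients over having the MDS stall on them) is via the central config store:

    ceph config set mds defer_client_eviction_on_laggy_osds false
    ceph config get mds defer_client_eviction_on_laggy_osds    # verify the override is in place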

[ceph-users] Re: reef 18.2.2 (hot-fix) QE validation status

2024-03-05 Thread Venky Shankar
+Patrick Donnelly On Tue, Mar 5, 2024 at 9:18 PM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/64721#note-1 > Release Notes - TBD > LRC upgrade - TBD > > Seeking approvals/reviews for: > > smoke - in progress > rados - Radek, Laura? >

[ceph-users] Re: reef 18.2.2 (hot-fix) QE validation status

2024-03-05 Thread Venky Shankar
Hi Laura, On Wed, Mar 6, 2024 at 4:53 AM Laura Flores wrote: > Here are the rados and smoke suite summaries. > > @Radoslaw Zarzynski , @Adam King > , @Nizamudeen A , mind having a look to ensure the > results from the rados suite look good to you? > > @Venky Shanka

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-20 Thread Venky Shankar
Hi Yuri, On Tue, Feb 20, 2024 at 9:29 PM Yuri Weinstein wrote: > > We have restarted QE validation after fixing issues and merging several PRs. > The new Build 3 (rebase of pacific) tests are summarized in the same > note (see Build 3 runs) https://tracker.ceph.com/issues/64151#note-1 > >

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-01-31 Thread Venky Shankar
On Tue, Jan 30, 2024 at 3:08 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/64151#note-1 > > Seeking approvals/reviews for: > > rados - Radek, Laura, Travis, Ernesto, Adam King > rgw - Casey > fs - Venky fs approved. Failures are -

[ceph-users] Re: mgr finish mon failed to return metadata for mds

2023-12-18 Thread Venky Shankar
On Tue, Dec 12, 2023 at 7:58 PM Eugen Block wrote: > > Can you restart the primary MDS (not sure which one it currently is, > should be visible from the mds daemon log) and see if this resolves at > least temporarily? Because after we recovered the cluster and cephfs > we did have output in 'ceph
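For illustration, the MDS metadata held by the monitors (which is what the mgr fails to fetch in this warning) can be inspected directly; the daemon name below is a placeholder:

    ceph mds metadata              # metadata for all known MDS daemons
    ceph mds metadata <name>       # metadata for a single daemon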

[ceph-users] Re: Terrible cephfs rmdir performance

2023-12-18 Thread Venky Shankar
Hi Paul, On Wed, Dec 13, 2023 at 9:50 PM Paul Mezzanini wrote: > > Long story short, we've got a lot of empty directories that I'm working on > removing. While removing directories, using "perf top -g" we can watch the > MDS daemon go to 100% cpu usage with "SnapRealm:: split_at" and >

[ceph-users] Re: MDS crashing repeatedly

2023-12-18 Thread Venky Shankar
Hi Thomas, On Wed, Dec 13, 2023 at 8:46 PM Thomas Widhalm wrote: > > Hi, > > I have a 18.2.0 Ceph cluster and my MDS are now crashing repeatedly. > After a few automatic restart, every MDS is removed and only one stays > active. But it's flagged "laggy" and I can't even start a scrub on it. > >

[ceph-users] Re: Ceph 17.2.7 to 18.2.0 issues

2023-12-06 Thread Venky Shankar
On Thu, Dec 7, 2023 at 12:49 PM Eugen Block wrote: > > Hi, did you unmount your clients after the cluster poweroff? If this is the case, then a remount would kick things back working. > You could > also enable debug logs in mds to see more information. Are there any > blocked requests? You can
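A minimal sketch of both suggestions (the daemon name is a placeholder):

    ceph config set mds.<name> debug_mds 20    # raise MDS debug logging
    ceph tell mds.<name> ops                   # list in-flight (possibly blocked) client requests
    ceph config rm mds.<name> debug_mds        # drop back to the default level afterwards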

[ceph-users] Re: MDS stuck in up:rejoin

2023-12-05 Thread Venky Shankar
things at a later point in time. As far your issue is concerned, is it possible to just throw away this fs and use a new one? > > Cheers, > Eric > > On 05/12/2023 06:10, Venky Shankar wrote: > > This email was sent to you by someone outside the University. > > You should

[ceph-users] Re: mds slow request with “failed to authpin, subtree is being exported"

2023-12-04 Thread Venky Shankar
On Tue, Dec 5, 2023 at 6:34 AM Xiubo Li wrote: > > > On 12/4/23 16:25, zxcs wrote: > > Thanks a lot, Xiubo! > > > > we already set ‘mds_bal_interval’ to 0. and the slow mds seems decrease. > > > > But somehow we still see mds complain slow request. and from mds log , can > > see > > > > “slow

[ceph-users] Re: MDS stuck in up:rejoin

2023-12-04 Thread Venky Shankar
Hi Eric, On Mon, Nov 27, 2023 at 8:00 PM Eric Tittley wrote: > > Hi all, > > For about a week our CephFS has experienced issues with its MDS. > > Currently the MDS is stuck in "up:rejoin" > > Issues become apparent when simple commands like "mv foo bar/" hung. I assume the MDS was active at

[ceph-users] Re: [ext] CephFS pool not releasing space after data deletion

2023-12-04 Thread Venky Shankar
Hi Mathias/Frank, (sorry for the late reply - this didn't get much attention including the tracker report and eventually got parked). Will have this looked into - expect an update in a day or two. On Sat, Dec 2, 2023 at 5:46 PM Frank Schilder wrote: > > Hi Mathias, > > have you made any

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-12-04 Thread Venky Shankar
Hi Yuri, On Fri, Dec 1, 2023 at 8:47 PM Yuri Weinstein wrote: > > Venky, pls review the test results for smoke and fs after the PRs were merged. fs run looks good. Summarized here https://tracker.ceph.com/projects/cephfs/wiki/Reef#04-Dec-2023 > > Radek, Igor, Adam - any updates on

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-27 Thread Venky Shankar
On Tue, Nov 21, 2023 at 10:35 PM Venky Shankar wrote: > > Hi Yuri, > > On Fri, Nov 10, 2023 at 1:22 PM Venky Shankar wrote: > > > > Hi Yuri, > > > > On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein wrote: > > > > > > I've updated all approvals

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-21 Thread Venky Shankar
Hi Yuri, On Fri, Nov 10, 2023 at 1:22 PM Venky Shankar wrote: > > Hi Yuri, > > On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein wrote: > > > > I've updated all approvals and merged PRs in the tracker and it looks > > like we are ready for gibba, LRC upgrades pending

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-09 Thread Venky Shankar
Hi Yuri, On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein wrote: > > I've updated all approvals and merged PRs in the tracker and it looks > like we are ready for gibba, LRC upgrades pending approval/update from > Venky. The smoke test failure is caused by missing (kclient) patches in Ubuntu

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-09 Thread Venky Shankar
Hi Yuri, On Wed, Nov 8, 2023 at 4:10 PM Venky Shankar wrote: > > Hi Yuri, > > On Wed, Nov 8, 2023 at 2:32 AM Yuri Weinstein wrote: > > > > 3 PRs above mentioned were merged and I am returning some tests: > > https://pulpito.ceph.com/?sha1=55e3239498650453ff76a9b0

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-08 Thread Venky Shankar
On Thu, Nov 9, 2023 at 3:53 AM Laura Flores wrote: > @Venky Shankar and @Patrick Donnelly > , I reviewed the smoke suite results and identified > a new bug: > > https://tracker.ceph.com/issues/63488 - smoke test fails from "NameError: > name 'DEBUGFS_META_DIR' is not de

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-08 Thread Venky Shankar
Hi Yuri, On Wed, Nov 8, 2023 at 2:32 AM Yuri Weinstein wrote: > > 3 PRs above mentioned were merged and I am returning some tests: > https://pulpito.ceph.com/?sha1=55e3239498650453ff76a9b06a37f1a6f488c8fd > > Still seeing approvals. > smoke - Laura, Radek, Prashant, Venky in progress > rados -

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-07 Thread Venky Shankar
On Tue, Nov 7, 2023 at 9:46 AM Venky Shankar wrote: > > Hi Yuri, > > On Tue, Nov 7, 2023 at 3:01 AM Yuri Weinstein wrote: > > > > Details of this release are summarized here: > > > > https://tracker.ceph.com/issues/63443#note-1 > > > > Seeking app

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-06 Thread Venky Shankar
Hi Yuri, On Tue, Nov 7, 2023 at 3:01 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/63443#note-1 > > Seeking approvals/reviews for: > > smoke - Laura, Radek, Prashant, Venky (POOL_APP_NOT_ENABLE failures) > rados - Neha, Radek,

[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-20 Thread Venky Shankar
Hi Yuri, On Fri, Oct 20, 2023 at 9:44 AM Venky Shankar wrote: > > Hi Yuri, > > On Thu, Oct 19, 2023 at 10:48 PM Venky Shankar wrote: > > > > Hi Yuri, > > > > On Thu, Oct 19, 2023 at 9:32 PM Yuri Weinstein wrote: > > > > > > We are still fin

[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-19 Thread Venky Shankar
Hi Yuri, On Thu, Oct 19, 2023 at 10:48 PM Venky Shankar wrote: > > Hi Yuri, > > On Thu, Oct 19, 2023 at 9:32 PM Yuri Weinstein wrote: > > > > We are still finishing off: > > > > - revert PR https://github.com/ceph/ceph/pull/54085, needs smoke suite rerun >

[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-19 Thread Venky Shankar
/github.com/ceph/ceph/pull/53139 is causing a smoke test failure. Details: https://github.com/ceph/ceph/pull/53139#issuecomment-1771388202 I've sent a revert for that change - https://github.com/ceph/ceph/pull/54108 - will let you know when it's ready for testing. > > On Wed, Oct 18, 2023 at 9

[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-18 Thread Venky Shankar
On Tue, Oct 17, 2023 at 12:23 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/63219#note-2 > Release Notes - TBD > > Issue https://tracker.ceph.com/issues/63192 appears to be failing several > runs. > Should it be fixed for this

[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-17 Thread Venky Shankar
On Tue, Oct 17, 2023 at 12:23 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/63219#note-2 > Release Notes - TBD > > Issue https://tracker.ceph.com/issues/63192 appears to be failing several > runs. > Should it be fixed for this

[ceph-users] Re: Time Estimation for cephfs-data-scan scan_links

2023-10-12 Thread Venky Shankar
Hi Odair, On Thu, Oct 12, 2023 at 11:58 PM Odair M. wrote: > > Hello, > > I've encountered an issue where the metadata pool has corrupted a cache > inode, leading to an MDS rank abort in the 'reconnect' state. To address > this, I'm following the "USING AN ALTERNATE METADATA POOL FOR RECOVERY" >

[ceph-users] Re: Next quincy point release 17.2.7

2023-10-09 Thread Venky Shankar
FYI - Added 5 cephfs related PRs - those are under test in Yuri's branch.

[ceph-users] Re: cephfs health warn

2023-10-03 Thread Venky Shankar
ull/52196 Suggest using single active mds or multimds with subtree pinning. > > Venky Shankar wrote on Tue, Oct 3, 2023 at 12:39: >> >> Hi Ben, >> >> Are you using multimds without subtree pinning? >> >> On Tue, Oct 3, 2023 at 10:00 AM Ben wrote: >> >
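For illustration, an explicit subtree pin is set via a virtual extended attribute on a directory of a mounted filesystem; the mount point, path and rank below are placeholders:

    setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/some/dir    # pin this subtree to MDS rank 0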

[ceph-users] Re: cephfs health warn

2023-10-02 Thread Venky Shankar
Hi Ben, Are you using multimds without subtree pinning? On Tue, Oct 3, 2023 at 10:00 AM Ben wrote: > > Dear cephers: > more log captures(see below) show the full segments list(more than 3 to > be trimmed stuck, growing over time). any ideas to get out of this? > > Thanks, > Ben > > > debug

[ceph-users] Re: cephfs health warn

2023-09-27 Thread Venky Shankar
Hi Ben, On Tue, Sep 26, 2023 at 6:02 PM Ben wrote: > > Hi, > see below for details of warnings. > the cluster is running 17.2.5. the warnings have been around for a while. > one concern of mine is num_segments growing over time. Any config changes related to trimming that was done? A slow
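One way to check whether any journal-trimming defaults were overridden (a sketch; mds_log_max_segments is the stock option name):

    ceph config get mds mds_log_max_segments
    ceph config dump | grep mds_log        # show any non-default mds_log_* overrides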

[ceph-users] Re: Libcephfs : ceph_readdirplus_r() with ceph_ll_lookup_vino() : ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)

2023-09-22 Thread Venky Shankar
Hi Joseph, On Fri, Sep 22, 2023 at 5:27 PM Joseph Fernandes wrote: > > Hello All, > > I found a weird issue with ceph_readdirplus_r() when used along > with ceph_ll_lookup_vino(). > On ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy > (stable) > > Any help is really

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Venky Shankar
for this. In the meantime, could you share the debug logs stated in my previous email? On Wed, Sep 20, 2023 at 3:07 PM Venky Shankar wrote: > Hi Janek, > > On Tue, Sep 19, 2023 at 4:44 PM Janek Bevendorff < > janek.bevendo...@uni-weimar.de> wrote: > >> Hi Venky, >> >

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Venky Shankar
20 and reset via # ceph config rm mds.<> debug_mds > Janek > > > On 19/09/2023 12:36, Venky Shankar wrote: > > Hi Janek, > > On Mon, Sep 18, 2023 at 9:52 PM Janek Bevendorff < > janek.bevendo...@uni-weimar.de> wrote: > >> Thanks! However, I

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-19 Thread Venky Shankar
Hi Janek, On Mon, Sep 18, 2023 at 9:52 PM Janek Bevendorff < janek.bevendo...@uni-weimar.de> wrote: > Thanks! However, I still don't really understand why I am seeing this. > This is due to a changes that was merged recently in pacific https://github.com/ceph/ceph/pull/52270 The MDS

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-30 Thread Venky Shankar
a > > > > From: Adiga, Anantha > Sent: Monday, August 7, 2023 1:29 PM > To: Venky Shankar ; ceph-users@ceph.io > Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import > hung > > > > Hi Venky, > > > > Could this be the reason that

[ceph-users] Re: 16.2.14 pacific QE validation status

2023-08-25 Thread Venky Shankar
On Fri, Aug 25, 2023 at 7:17 AM Patrick Donnelly wrote: > > On Wed, Aug 23, 2023 at 10:41 AM Yuri Weinstein wrote: > > > > Details of this release are summarized here: > > > > https://tracker.ceph.com/issues/62527#note-1 > > Release Notes - TBD > > > > Seeking approvals for: > > > > smoke -

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
id as `mirror_remote` (since I guess these are the secondary clusters' conf given the names). > > Thank you, > Anantha > > -Original Message- > From: Venky Shankar > Sent: Monday, August 7, 2023 7:05 PM > To: Adiga, Anantha > Cc: ceph-users@ceph.io > Subje

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
) and then running ceph -c /path/to/secondary/ceph.conf --id <> status If that runs all fine, then the mirror daemon is probably hitting some bug. > These two clusters are configured for rgw multisite and is functional. > > Thank you, > Anantha > > -Original Message--

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
> > caps mon = "allow r fsname=cephfs" > > caps osd = "allow rw tag cephfs data=cephfs" > > root@a001s008-zz14l47008:/# > > root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create > cephfs client.mirror_remote shgR-site > > {"token": > "eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgIn

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Venky Shankar
> > root@fl31ca104ja0201:/# > > > > > > Thank you, > > Anantha > > From: Adiga, Anantha > Sent: Monday, August 7, 2023 11:21 AM > To: 'Venky Shankar' ; 'ceph-users@ceph.io' > > Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import > hu

[ceph-users] Re: MDS nodes blocklisted

2023-08-04 Thread Venky Shankar
Hi Nathan, On Mon, Jul 31, 2023 at 4:34 PM Nathan Harper wrote: > > Hi, > > We're having sporadic problems with a CephFS filesystem where MDSs end up > on the OSD blocklist. We're still digging around looking for a cause > (Ceph related or other infrastructure cause). The monitors can

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-04 Thread Venky Shankar
Hi Anantha, On Fri, Aug 4, 2023 at 2:27 AM Adiga, Anantha wrote: > > Hi > > Could you please provide guidance on how to diagnose this issue: > > In this case, there are two Ceph clusters: cluster A, 4 nodes and cluster B, > 3 node, in different locations. Both are already running RGW

[ceph-users] Re: ref v18.2.0 QE Validation status

2023-08-01 Thread Venky Shankar
Hi Yuri, On Tue, Aug 1, 2023 at 10:34 PM Venky Shankar wrote: > > On Tue, Aug 1, 2023 at 5:55 PM Venky Shankar wrote: > > > > On Tue, Aug 1, 2023 at 1:21 AM Yuri Weinstein wrote: > > > > > > Pls see the updated test results and Release Notes PR > >

[ceph-users] Re: ref v18.2.0 QE Validation status

2023-08-01 Thread Venky Shankar
On Tue, Aug 1, 2023 at 5:55 PM Venky Shankar wrote: > > On Tue, Aug 1, 2023 at 1:21 AM Yuri Weinstein wrote: > > > > Pls see the updated test results and Release Notes PR > > https://github.com/ceph/ceph/pull/52490 > > > > Still seeking approvals: >

[ceph-users] Re: ref v18.2.0 QE Validation status

2023-08-01 Thread Venky Shankar
On Tue, Aug 1, 2023 at 1:21 AM Yuri Weinstein wrote: > > Pls see the updated test results and Release Notes PR > https://github.com/ceph/ceph/pull/52490 > > Still seeking approvals: > smoke - Laura, Radek, Venky > rados - Radek, Laura, Nizamudeen > fs - Venky > orch - Adam King > powercycle -

[ceph-users] Re: mds terminated

2023-07-20 Thread Venky Shankar
On Thu, Jul 20, 2023 at 11:19 PM wrote: > > If any rook-ceph users see the situation that mds is stuck in replay, then > look at the logs of the mds pod. > > When it runs and then terminates repeatedly, check if there is "liveness > probe termninated" error message by typing "kubectl describe

[ceph-users] Re: MDS Upgrade from 17.2.5 to 17.2.6 not possible

2023-05-17 Thread Venky Shankar
Hi Henning, On Wed, May 17, 2023 at 9:25 PM Henning Achterrath wrote: > > Hi all, > > we did a major update from Pacific to Quincy (17.2.5) a month ago > without any problems. > > Now we have tried a minor update from 17.2.5 to 17.2.6 (ceph orch > upgrade). It stucks at mds upgrade phase. At

[ceph-users] Re: Telemetry and Redmine sync

2023-05-15 Thread Venky Shankar
Hi Yaarit, On Fri, May 12, 2023 at 7:23 PM Yaarit Hatuka wrote: > > Hi everyone, > > Over this weekend we will run a sync between telemetry crashes and Redmine > tracker issues. > This might affect your inbox, depending on your Redmine email notification > setup. You can set up filters for these

[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-08 Thread Venky Shankar
; and rerun appropriate suites. > > > > > > On Thu, May 4, 2023 at 9:07 AM Radoslaw Zarzynski > > > wrote: > > > > > > > > If we get some time, I would like to include: > > > > > > > > https://github.com/ceph/ceph/pull/50894. > > >

[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-04 Thread Venky Shankar
Hi Yuri, On Wed, May 3, 2023 at 7:10 PM Venky Shankar wrote: > > On Tue, May 2, 2023 at 8:25 PM Yuri Weinstein wrote: > > > > Venky, I did plan to cherry-pick this PR if you approve this (this PR > > was used for a rerun) > > OK. The fs suite failure

[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-03 Thread Venky Shankar
On Tue, May 2, 2023 at 8:25 PM Yuri Weinstein wrote: > > Venky, I did plan to cherry-pick this PR if you approve this (this PR > was used for a rerun) OK. The fs suite failure is being looked into (https://tracker.ceph.com/issues/59626). > > On Tue, May 2, 2023 at 7:51 AM Venky

[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-02 Thread Venky Shankar
Hi Yuri, On Fri, Apr 28, 2023 at 2:53 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/59542#note-1 > Release Notes - TBD > > Seeking approvals for: > > smoke - Radek, Laura > rados - Radek, Laura > rook - Sébastien Han > cephadm -

[ceph-users] Re: cephfs - max snapshot limit?

2023-05-01 Thread Venky Shankar
Hi Arnaud, On Fri, Apr 28, 2023 at 2:16 PM MARTEL Arnaud wrote: > > Hi Venky, > > > Also, at one point the kclient wasn't able to handle more than 400 > > snapshots (per file system), but we have come a long way from that and that > > is not a constraint right now. > Does it mean that there is

[ceph-users] Re: cephfs - max snapshot limit?

2023-04-28 Thread Venky Shankar
Hi Tobias, On Thu, Apr 27, 2023 at 2:42 PM Tobias Hachmer wrote: > > Hi sur5r, > > Am 4/27/23 um 10:33 schrieb Jakob Haufe: > > On Thu, 27 Apr 2023 09:07:10 +0200 > > Tobias Hachmer wrote: > > > >> But we observed that max 50 snapshot are preserved. If a new snapshot is > >> created the

[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-19 Thread Venky Shankar
Hi Reto, On Wed, Apr 19, 2023 at 9:34 PM Ilya Dryomov wrote: > > On Wed, Apr 19, 2023 at 5:57 PM Reto Gysi wrote: > > > > > > Hi, > > > > Am Mi., 19. Apr. 2023 um 11:02 Uhr schrieb Ilya Dryomov > > : > >> > >> On Wed, Apr 19, 2023 at 10:29 AM Reto Gysi wrote: > >> > > >> > yes, I used the

[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-27 Thread Venky Shankar
On Sat, Mar 25, 2023 at 1:17 AM Yuri Weinstein wrote: > > Details of this release are updated here: > > https://tracker.ceph.com/issues/59070#note-1 > Release Notes - TBD > > The slowness we experienced seemed to be self-cured. > Neha, Radek, and Laura please provide any findings if you have

[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-22 Thread Venky Shankar
On Wed, Mar 22, 2023 at 1:36 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/59070#note-1 > Release Notes - TBD > > The reruns were in the queue for 4 days because of some slowness issues. > The core team (Neha, Radek, Laura, and

[ceph-users] Re: Does cephfs subvolume have commands similar to `rbd perf` to query iops, bandwidth, and latency of rbd image?

2023-02-13 Thread Venky Shankar
On Tue, Feb 14, 2023 at 12:05 AM 郑亮 wrote: > > Hi all, > > Does cephfs subvolume have commands similar to rbd perf to query iops, > bandwidth, and latency of rbd image? `ceph fs perf stats` shows metrics of > the client side, not the metrics of the cephfs subvolume. What I want to > get is the

[ceph-users] Re: ceph-fuse in infinite loop reading objects without client requests

2023-02-07 Thread Venky Shankar
Hi Andras, On Sat, Feb 4, 2023 at 1:59 AM Andras Pataki wrote: > > We've been running into a strange issue with ceph-fuse on some nodes > lately. After some job runs on the node (and finishes or gets killed), > ceph-fuse gets stuck busy requesting objects from the OSDs without any > processes

[ceph-users] Re: Mds crash at cscs

2023-01-24 Thread Venky Shankar
On Thu, Jan 19, 2023 at 9:07 PM Lo Re Giuseppe wrote: > > Dear all, > > We have started to use more intensively cephfs for some wlcg related workload. > We have 3 active mds instances spread on 3 servers, > mds_cache_memory_limit=12G, most of the other configs are default ones. > One of them has

[ceph-users] Re: 16.2.11 pacific QE validation status

2023-01-24 Thread Venky Shankar
On Mon, Jan 23, 2023 at 11:22 PM Yuri Weinstein wrote: > > Ilya, Venky > > rbd, krbd, fs reruns are almost ready, pls review/approve fs approved. > > On Mon, Jan 23, 2023 at 2:30 AM Ilya Dryomov wrote: > > > > On Fri, Jan 20, 2023 at 5:38 PM Yuri Weinstein wrote: > > > > > > The overall

[ceph-users] Re: MDS stuck in "up:replay"

2023-01-23 Thread Venky Shankar
On Tue, Jan 24, 2023 at 1:34 AM wrote: > > Hello Thomas, > > I have same issue with mds like you describe, and ceph version is a same. Did > state up:replay ever finish in your case? There is probably much going on with Thomas's cluster which is blocking the mds to make progress. Could you

[ceph-users] Re: 16.2.11 pacific QE validation status

2023-01-23 Thread Venky Shankar
Hey Yuri, On Fri, Jan 20, 2023 at 10:08 PM Yuri Weinstein wrote: > > The overall progress on this release is looking much better and if we > can approve it we can plan to publish it early next week. > > Still seeking approvals > > rados - Neha, Laura > rook - Sébastien Han > cephadm - Adam >

[ceph-users] Re: MDS stuck in "up:replay"

2023-01-19 Thread Venky Shankar
y seq 241 addr > [v2:192.168.23.66:6810/2868317045,v1:192.168.23.66:6811/2868317045] > compat {c=[1],r=[1],i=[7ff]}] > > > Standby daemons: > > [mds.mds01.ceph05.pqxmvt{-1:61834887} state up:standby seq 1 addr > [v2:192.168.23.65:6800/957802673,v1:192.168.23.65:6801/9578026

[ceph-users] Re: MDS stuck in "up:replay"

2023-01-19 Thread Venky Shankar
Hi Thomas, On Tue, Jan 17, 2023 at 5:34 PM Thomas Widhalm wrote: > > Another new thing that just happened: > > One of the MDS just crashed out of nowhere. > >

[ceph-users] Re: 17.2.5 ceph fs status: AssertionError

2023-01-19 Thread Venky Shankar
Hi Robert, On Wed, Jan 18, 2023 at 2:43 PM Robert Sander wrote: > > Hi, > > I have a healthy (test) cluster running 17.2.5: > > root@cephtest20:~# ceph status >cluster: > id: ba37db20-2b13-11eb-b8a9-871ba11409f6 > health: HEALTH_OK > >services: > mon: 3

[ceph-users] Re: MDS error

2023-01-14 Thread Venky Shankar
Hi André, On Sat, Jan 14, 2023 at 12:14 AM André de Freitas Smaira wrote: > > Hello! > > Yesterday we found some errors in our cephadm disks, which is making it > impossible to access our HPC Cluster: > > # ceph health detail > HEALTH_WARN 3 failed cephadm daemon(s); insufficient standby MDS

[ceph-users] Re: MDS crashes to damaged metadata

2023-01-08 Thread Venky Shankar
Hi Felix, On Thu, Dec 15, 2022 at 8:03 PM Stolte, Felix wrote: > > Hi Patrick, > > we used your script to repair the damaged objects on the weekend and it went > smoothly. Thanks for your support. > > We adjusted your script to scan for damaged files on a daily basis, runtime > is about 6h.

[ceph-users] Re: Cannot create CephFS subvolume

2023-01-02 Thread Venky Shankar
Hi Daniel, On Wed, Dec 28, 2022 at 3:17 AM Daniel Kovacs wrote: > > Hello! > > I'd like to create a CephFS subvol, with these command: ceph fs > subvolume create cephfs_ssd subvol_1 > I got this error: Error EINVAL: invalid value specified for > ceph.dir.subvolume > If I use another cephfs

[ceph-users] Re: CephFS: Isolating folders for different users

2023-01-02 Thread Venky Shankar
Hi Jonas, On Mon, Jan 2, 2023 at 10:52 PM Jonas Schwab wrote: > > Thank you very much! Works like a charm, except for one thing: I gave my > clients the MDS caps 'allow rws path=' to also be able > to create snapshots from the client, but `mkdir .snap/test` still returns > mkdir: cannot
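The trailing 's' in the MDS caps is what permits snapshot creation; for reference, the documented way to mint such a key in one step looks like this (filesystem name, client id and path are placeholders):

    ceph fs authorize cephfs client.user1 /projects/user1 rws
    ceph auth get client.user1    # inspect the generated mds/osd caps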

[ceph-users] Re: 16.2.11 pacific QE validation status

2022-12-19 Thread Venky Shankar
On Thu, Dec 15, 2022 at 10:45 PM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/58257#note-1 > Release Notes - TBD > > Seeking approvals for: > > rados - Neha (https://github.com/ceph/ceph/pull/49431 is still being > tested and will be

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-16 Thread Venky Shankar
On Fri, Dec 16, 2022 at 1:27 PM Holger Naundorf wrote: > > > > On 15.12.22 14:06, Venky Shankar wrote: > > Hi Holger, > > > > (sorry for the late reply) > > > > On Fri, Dec 9, 2022 at 6:22 PM Holger Naundorf > > wrote: > >> > >>

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-15 Thread Venky Shankar
15:53, Holger Naundorf wrote: > > On 06.12.22 14:17, Venky Shankar wrote: > >> On Tue, Dec 6, 2022 at 6:34 PM Holger Naundorf > >> wrote: > >>> > >>> > >>> > >>> On 06.12.22 09:54, Venky Shankar wrote: > >>>>

[ceph-users] Re: MDS_DAMAGE dir_frag

2022-12-14 Thread Venky Shankar
Hi Sascha, On Tue, Dec 13, 2022 at 6:43 PM Sascha Lucas wrote: > > Hi, > > On Mon, 12 Dec 2022, Sascha Lucas wrote: > > > On Mon, 12 Dec 2022, Gregory Farnum wrote: > > >> Yes, we’d very much like to understand this. What versions of the server > >> and kernel client are you using? What platform

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-06 Thread Venky Shankar
On Tue, Dec 6, 2022 at 6:34 PM Holger Naundorf wrote: > > > > On 06.12.22 09:54, Venky Shankar wrote: > > Hi Holger, > > > > On Tue, Dec 6, 2022 at 1:42 PM Holger Naundorf > > wrote: > >> > >> Hello, > >> we have set up a snap-mi

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-06 Thread Venky Shankar
Hi Holger, On Tue, Dec 6, 2022 at 1:42 PM Holger Naundorf wrote: > > Hello, > we have set up a snap-mirror for a directory on one of our clusters - > running ceph version > > ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific > (stable) > > to get mirrorred our other cluster
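As a first check when mirroring stalls, the mirror daemon reports its per-filesystem and per-peer state (command from the cephfs-mirroring docs; depending on the release it may also take the filesystem name as an argument):

    ceph fs snapshot mirror daemon status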

[ceph-users] Re: MDS stuck ops

2022-11-29 Thread Venky Shankar
h, octopus and pacific. I'm not sure why that happened, especially when there are no explicit pins set for sub-directories. Maybe Patrick has an explanation. > > It would be great if you could help me out here. Maybe it really is just > terminology? > > Thanks a lot for your time ag

[ceph-users] Re: MDS stuck ops

2022-11-29 Thread Venky Shankar
s are fine - quincy docs do mention that the directory fragments are distributed while the octopus docs do not. I agree, the wordings are a bit subtle. > > Thanks for any insight! > > Best regards, > ========= > Frank Schilder > AIT Risø Campus > Bygning 109, rum

[ceph-users] Re: MDS stuck ops

2022-11-29 Thread Venky Shankar
On Tue, Nov 29, 2022 at 1:42 PM Frank Schilder wrote: > > Hi Venky. > > > You most likely ran into performance issues with distributed ephemeral > > pins with octopus. It'd be nice to try out one of the latest releases > > for this. > > I run into the problem that distributed ephemeral pinning

[ceph-users] Re: MDS stuck ops

2022-11-28 Thread Venky Shankar
formance issues with distributed ephemeral pins with octopus. It'd be nice to try out one of the latest releases for this. > > Best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > > Fro

[ceph-users] Re: MDS stuck ops

2022-11-28 Thread Venky Shankar
DS internal op > exportdir despite ephemeral pinning". Since I pinned everything all problems > are gone and performance is boosted. We are also on octopus. > > Best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > >

[ceph-users] Re: MDS stuck ops

2022-11-28 Thread Venky Shankar
On Mon, Nov 28, 2022 at 10:19 PM Reed Dier wrote: > > Hopefully someone will be able to point me in the right direction here: > > Cluster is Octopus/15.2.17 on Ubuntu 20.04. > All are kernel cephfs clients, either 5.4.0-131-generic or 5.15.0-52-generic. > Cluster is nearful, and more storage is

[ceph-users] Re: CephFS Snapshot Mirroring slow due to repeating attribute sync

2022-11-28 Thread Venky Shankar
Hi Mathias, (apologies for the super late reply - I was getting back from a long vacation and missed seeing this). I updated the tracker ticket. Let's move the discussion there... On Mon, Nov 28, 2022 at 7:46 PM Venky Shankar wrote: > > On Tue, Aug 23, 2022 at 10:01 PM Kuhring, M

[ceph-users] Re: CephFS Snapshot Mirroring slow due to repeating attribute sync

2022-11-28 Thread Venky Shankar
On Tue, Aug 23, 2022 at 10:01 PM Kuhring, Mathias wrote: > > Dear Ceph developers and users, > > We are using ceph version 17.2.1 > (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable). > We are using cephadm since version 15 octopus. > > We mirror several CephFS directories from our main

[ceph-users] Re: CephFS constant high write I/O to the metadata pool

2022-11-10 Thread Venky Shankar
p once a second to a file or > sequential files but is there some tool or convention that is easy to look at > and analyze? Not really - you'd have to do it yourself. > > Tnx, > --- > Olli Rajala - Lead TD > Anima Vitae Ltd. > www.anima.fi > --
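An ad-hoc way to do it yourself is to poll the MDS perf counters on the host running the active MDS (daemon name is a placeholder; section names may vary by release):

    watch -n 1 'ceph daemon mds.<name> perf dump objecter'    # outstanding ops towards the OSDs
    ceph daemon mds.<name> perf dump mds_log                  # journal write/trim counters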

[ceph-users] Re: CephFS constant high write I/O to the metadata pool

2022-11-09 Thread Venky Shankar
Hi Olli, On Mon, Oct 17, 2022 at 1:08 PM Olli Rajala wrote: > > Hi Patrick, > > With "objecter_ops" did you mean "ceph tell mds.pve-core-1 ops" and/or > "ceph tell mds.pve-core-1 objecter_requests"? Both these show very few > requests/ops - many times just returning empty lists. I'm pretty sure

[ceph-users] Re: quincy v17.2.4 QE Validation status

2022-09-17 Thread Venky Shankar
On Wed, Sep 14, 2022 at 1:33 AM Yuri Weinstein wrote: > > Details of this release are summarized here: > > https://tracker.ceph.com/issues/57472#note-1 > Release Notes - https://github.com/ceph/ceph/pull/48072 > > Seeking approvals for: > > rados - Neha, Travis, Ernesto, Adam > rgw - Casey > fs -

[ceph-users] Re: The next quincy point release

2022-09-01 Thread Venky Shankar
On Tue, Aug 30, 2022 at 10:48 PM Yuri Weinstein wrote: > > I have several PRs in testing: > > https://github.com/ceph/ceph/labels/wip-yuri2-testing > https://github.com/ceph/ceph/labels/wip-yuri-testing (needs fs review) > > Assuming they were merged, anything else is a must to be added to the >
