Hi,

I don't know how to find a full path from a dir object.
But perhaps you can make an educated guess based on what you see in:

rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.01000000 | head -n 100

Those should be the directory entries. (s/_head//)
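
For example, to list the first entries with the "_head" suffix already
stripped, something like:

rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.01000000 | sed 's/_head$//' | head -n 100

With luck the entry names alone will hint at which user or job created them.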

-- Dan

On Tue, Aug 31, 2021 at 2:31 PM Frank Schilder <fr...@dtu.dk> wrote:
>
> Dear Dan and Patrick,
>
> The find didn't return anything. With this and the info below, am I right to 
> assume that these were temporary working directories that got caught in a 
> snapshot? (We use rolling snapshots.)
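>
> (If so, and assuming snapshots report the same inode numbers as the live 
> tree, I suppose I could repeat the search under the snapshot directories, 
> e.g. something like
>
> find /cephfs/.snap -type d -inum 1099738108263
>
> although on a file system this size that would again take quite a while.)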
>
> I would really appreciate any ideas on how to find out the original file 
> system path of these large directories. I would like to advise the user(s) 
> that we have a special high-performance file system for temporary data.
>
> I can't find any indications of performance problems with the metadata pool. 
> After re-deploying the OSDs, which quadrupled the OSD count, the metadata 
> pool seems to perform very well. The find ran over a 1.3PB file system in 
> under 18 hours.
>
> However, running this find on the file system root got me into another problem: 
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/HKEBXXRMX5WA5Y6JFM34WFPMWTCMPFCG/#EMHNSHZIPFZZ5QYS6B4VW3LUGL6HDTOP
>
> Apparently, the metadata performance is now so high that a single client can 
> crash an MDS daemon and even take the whole MDS cluster down with it.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder
> Sent: 30 August 2021 16:18:02
> To: ceph-users
> Cc: Dan van der Ster; Patrick Donnelly
> Subject: Re: [ceph-users] LARGE_OMAP_OBJECTS: any proper action possible?
>
> Dear Dan and Patrick,
>
> I suspect that I'm looking at large directories that only exist in snapshots 
> and are no longer present on the live file system. Hence, the omap objects 
> are not fragmented as explained in the tracker issue. Here is the info you 
> asked me to pull out:
>
> > find /cephfs -type d -inum 1099738108263
>
> The find hasn't returned yet. It would be great to find out which user is 
> doing this. Unfortunately, I don't believe the directory still exists.
>
> > rados -p cephfs_metadata listomapkeys 1000d7fd167.02800000
>
> I did this on a different object:
>
> # rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.01000000 | wc -l
> 216000
>
> This matches the log message. I guess these keys are file/dir names? If so, 
> then yes, it's a huge directory.
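>
> As an aside, if I read the naming scheme correctly, the object name prefix 
> is simply the directory's inode number in hex, so it can be converted back 
> to the decimal number that a find by inode would use:
>
> # printf '%d\n' 0x1000eec35f5
> 1099761989109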
>
> > Please try the resolutions suggested in: 
> > https://tracker.ceph.com/issues/45333
>
> If I understand correctly, the INODE.00000000 objects contain the path 
> information:
>
> [root@gnosis ~]# rados listxattr --pool=con-fs2-meta1 1000eec35f5.01000000
> [root@gnosis ~]# rados listxattr --pool=con-fs2-meta1 1000eec35f5.00000000
> layout
> parent
>
> Decoding the meta info in the parent attribute gives:
>
> [root@gnosis ~]# rados getxattr --pool=con-fs2-meta1 1000eec35f5.00000000 
> parent | ceph-dencoder type inode_backtrace_t import - decode dump_json
> {
>     "ino": 1099761989109,
>     "ancestors": [
>         {
>             "dirino": 1552,
>             "dname": "1000eec35f5",
>             "version": 882614706
>         },
>         {
>             "dirino": 257,
>             "dname": "stray6",
>             "version": 563853824
>         }
>     ],
>     "pool": 12,
>     "old_pools": []
> }
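>
> If I read the special inode numbers correctly (assuming the usual bases of 
> 0x100 for the per-rank ~mdsdir and 0x600 for the stray directories, with 10 
> strays per rank), dirino 257 is ~mds1 and dirino 1552 is its stray6 
> directory, which matches the "stray6" dname above:
>
> # echo $(( 0x100 + 1 )) $(( 0x600 + 1*10 + 6 ))
> 257 1552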
>
> This smells a lot like a deleted directory in a snapshot, moved to one of the 
> stray object buckets. The result is essentially the same for all large omap 
> objects except for the stray number. Is it possible to figure out the 
> original path in the file system?
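>
> (For reference, I checked the other objects with essentially the same 
> command, roughly like this; the object names are the ones from the health 
> detail and are shown here just as examples.)
>
> # for o in 1000eec35f5 1000d7fd167; do rados getxattr --pool=con-fs2-meta1 $o.00000000 parent | ceph-dencoder type inode_backtrace_t import - decode dump_json | grep dname; done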
>
> I guess I have to increase the warning threshold or live with the warning 
> message, neither of which I would prefer. It would be great if you could help 
> me find the original path so I can identify the user and advise him/her on 
> how to organise his/her files.
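>
> If raising the threshold turns out to be the only option, I assume it would 
> be something like the following (the default key-count threshold being 
> 200000, if I remember correctly):
>
> # ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 300000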
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Patrick Donnelly <pdonn...@redhat.com>
> Sent: 27 August 2021 19:16:16
> To: Frank Schilder
> Cc: ceph-users
> Subject: Re: [ceph-users] LARGE_OMAP_OBJECTS: any proper action possible?
>
> Hi Frank,
>
> On Wed, Aug 25, 2021 at 6:27 AM Frank Schilder <fr...@dtu.dk> wrote:
> >
> > Hi all,
> >
> > I have the notorious "LARGE_OMAP_OBJECTS: 4 large omap objects" warning and 
> > am again wondering whether there is any proper action one can take except 
> > "wait it out and deep-scrub" (numerous ceph-users threads) or "ignore it" 
> > (https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects).
> > A proper action is described only for RGWs, but my large omap objects come 
> > from the MDSes. Is there any way to ask an MDS to clean up or split the 
> > objects?
> >
> > The disks with the metadata pool can easily deal with objects of this 
> > size. My question is more along the lines of: if I can't do anything anyway, 
> > why the warning? If there is a warning, I would assume there is something 
> > proper one can do to prevent large omap objects from being created by an 
> > MDS in the first place. What is it?
>
> Please try the resolutions suggested in: https://tracker.ceph.com/issues/45333
>
> --
> Patrick Donnelly, Ph.D.
> He / Him / His
> Principal Software Engineer
> Red Hat Sunnyvale, CA
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>
