Dear CephFSers.

We are running Ceph/CephFS 10.2.2. All infrastructure (RADOS cluster, MONs, MDS and CephFS clients) is on the same version. We mount CephFS using ceph-fuse.
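
For reference, the mounts look roughly like this (reconstructed from the client_metadata shown further down, so treat the exact options as a sketch):

   # mount CephFS via FUSE as client "mount_user", exposing the /cephfs
   # subtree of the filesystem under /coepp/cephfs on the local host
   ceph-fuse --id mount_user -r /cephfs /coepp/cephfs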

Last week I asked some of my heavy users to delete data. In the following example, the user in question decreased his usage from ~4.5TB to ~600GB. However, some clients still have not updated the usage (although several days have passed by), while others are fine.

From the point of view of the MDS, both types of client have healthy sessions. See the detailed info at the end of this email.

Trying to kick the session does not solve the issue. Probably only a remount would, but users are heavily using the filesystem and I do not want to break things for them right now.
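
By "kicking the session" I mean something along the lines of the client admin socket command below (a sketch, assuming the kick_stale_sessions asok command is available on this build; it only acts on sessions the client considers stale):

   # on the affected client, ask ceph-fuse to re-establish any stale MDS
   # sessions via its admin socket (a no-op while the session is "open")
   ceph daemon /var/run/ceph/ceph-client.mount_user.asok kick_stale_sessions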


The only difference I can actually dig out between the "good" and "bad" clients is that the user still has active bash sessions on the "bad" client (from which he triggered the deletions); see the lsof output below and the cache-drop sketch right after it:

   # lsof | grep user1 | grep ceph
   bash      15737   user1  cwd       DIR   0,24  5285584388909  1099514070586  /coepp/cephfs/mel/user1
   vim       19233   user1  cwd       DIR   0,24       24521126  1099514340633  /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
   vim       19233   user1    5u      REG   0,24          16384  1099557935412  /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts/.histmgr.py.swp
   bash      24187   user1  cwd       DIR   0,24      826758558  1099514314315  /coepp/cephfs/mel/user1/Analysis
   bash      24256   user1  cwd       DIR   0,24         147600  1099514340621  /coepp/cephfs/mel/user1/Analysis/ssdilep/run
   bash      24327   user1  cwd       DIR   0,24         151068  1099514340590  /coepp/cephfs/mel/user1/Analysis/ssdilep/algs
   bash      24394   user1  cwd       DIR   0,24         151068  1099514340590  /coepp/cephfs/mel/user1/Analysis/ssdilep/algs
   bash      24461   user1  cwd       DIR   0,24         356436  1099514340614  /coepp/cephfs/mel/user1/Analysis/ssdilep/samples
   bash      24528   user1  cwd       DIR   0,24       24521126  1099514340633  /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
   bash      24601   user1  cwd       DIR   0,24       24521126  1099514340633  /coepp/cephfs/mel/user1/Analysis/ssdilep/scripts
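
If those working directories are what keeps the stale metadata pinned, one thing I could still try without a remount is to drop the client's clean dentry/inode cache so that ceph-fuse hands unused caps back to the MDS. A minimal sketch, assuming the standard Linux drop_caches sysctl also works this way for FUSE mounts:

   # on the "bad" client: flush dirty data, then drop clean dentries and
   # inodes; ceph-fuse should release the caps backing them and fetch
   # fresh rstats on the next lookup
   sync
   echo 2 > /proc/sys/vm/drop_caches
   ls -lh /coepp/cephfs/mel/ | grep user1    # re-check the reported size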

Is there a particular way to force the client to update this info? Do we actually know why it is taking so long to update?
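
For what it is worth, one quick way to compare what each client reports for the same directory (a sketch; the hostnames are the two clients from the detailed output below):

   # compare the recursive byte count (ceph.dir.rbytes) seen by the "bad"
   # and the "good" client for the same directory
   for h in badclient.my.domain goodclient.my.domain; do
       printf '%s: ' "$h"
       ssh "$h" getfattr --only-values -n ceph.dir.rbytes /coepp/cephfs/mel/user1
       echo
   done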

Cheers

Goncalo

--- * ---


1) Reports from a client which shows "obsolete" file/directory sizes:

   # ll -h /coepp/cephfs/mel/ | grep user1
   *drwxr-xr-x 1 user1      coepp_mel 4.9T Oct  7 00:20 user1*

   # getfattr -d -m ceph /coepp/cephfs/mel/user1
   getfattr: Removing leading '/' from absolute path names
   # file: coepp/cephfs/mel/user1
   ceph.dir.entries="10"
   ceph.dir.files="1"
   *ceph.dir.rbytes="5285584388909"*
   ceph.dir.rctime="1480390891.09882864298"
   ceph.dir.rentries="161047"
   ceph.dir.rfiles="149669"
   ceph.dir.rsubdirs="11378"
   ceph.dir.subdirs="9"

   ---> Running the following command on the client:
   # ceph daemon /var/run/ceph/ceph-client.mount_user.asok mds_sessions
   {
        "id": 616794,
        "sessions": [
            {
                "mds": 0,
                "addr": "<MDS IP>:6800\/1457",
                "seq": 4884237,
                "cap_gen": 0,
                "cap_ttl": "2016-12-04 22:45:53.046697",
                "last_cap_renew_request": "2016-12-04 22:44:53.046697",
                "cap_renew_seq": 166765,
                "num_caps": 1567318,
                "state": "open"
            }
        ],
        "mdsmap_epoch": 5224
   }

   ---> Running the following command on the MDS:
   # ceph daemon mds.rccephmds session ls
   (...)

       {
            "id": 616794,
            "num_leases": 0,
            "num_caps": 21224,
            "state": "open",
            "replay_requests": 0,
            "completed_requests": 0,
            "reconnecting": false,
            "inst": "client.616794 <BAD CLIENT IP>:0\/68088301",
            "client_metadata": {
                "ceph_sha1": "45107e21c568dd033c2f0a3107dec8f0b0e58374",
                "ceph_version": "ceph version 10.2.2
   (45107e21c568dd033c2f0a3107dec8f0b0e58374)",
                "entity_id": "mount_user",
                "hostname": "badclient.my.domain",
                "mount_point": "\/coepp\/cephfs",
                "root": "\/cephfs"
            }
        },

2) Reports from a client which shows "good" file/directory sizes:

   # ll -h /coepp/cephfs/mel/ | grep user1
   drwxr-xr-x 1 user1      coepp_mel 576G Oct  7 00:20 user1

   # getfattr -d -m ceph /coepp/cephfs/mel/user1
   getfattr: Removing leading '/' from absolute path names
   # file: coepp/cephfs/mel/user1
   ceph.dir.entries="10"
   ceph.dir.files="1"
   *ceph.dir.rbytes="617756983774"*
   ceph.dir.rctime="1480844101.09560671770"
   ceph.dir.rentries="96519"
   ceph.dir.rfiles="95091"
   ceph.dir.rsubdirs="1428"
   ceph.dir.subdirs="9"

   ---> Running the following command on the client:
   # ceph daemon /var/run/ceph/ceph-client.mount_user.asok mds_sessions
   {
        "id": 616338,
        "sessions": [
            {
                "mds": 0,
                "addr": "<MDS IP>:6800\/1457",
                "seq": 7851161,
                "cap_gen": 0,
                "cap_ttl": "2016-12-04 23:32:30.041978",
                "last_cap_renew_request": "2016-12-04 23:31:30.041978",
                "cap_renew_seq": 169143,
                "num_caps": 311386,
                "state": "open"
            }
        ],
        "mdsmap_epoch": 5224
   }


    ---> Running the following command on the MDS:
    # ceph daemon mds.rccephmds session ls
    (...)

        {
            "id": 616338,
            "num_leases": 0,
            "num_caps": 16078,
            "state": "open",
            "replay_requests": 0,
            "completed_requests": 0,
            "reconnecting": false,
            "inst": "client.616338 <GOOD CLIENT IP>:0\/3807825927",
            "client_metadata": {
                "ceph_sha1": "45107e21c568dd033c2f0a3107dec8f0b0e58374",
                "ceph_version": "ceph version 10.2.2
   (45107e21c568dd033c2f0a3107dec8f0b0e58374)",
                "entity_id": "mount_user",
                "hostname": "goodclient.my.domain",
                "mount_point": "\/coepp\/cephfs",
                "root": "\/cephfs"
            }
        },


--
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW  2006
T: +61 2 93511937
