Re: [ceph-users] Problem with CephFS - No space left on device
Hi Yoann, thanks a lot for your help.

root@pf-us1-dfs3:/home/rodrigo# ceph osd crush tree
ID CLASS WEIGHT   TYPE NAME
-1       72.77390 root default
-3       29.10956     host pf-us1-dfs1
 0   hdd  7.27739         osd.0
 5   hdd  7.27739         osd.5
 6   hdd  7.27739         osd.6
 8   hdd  7.27739         osd.8
-5       29.10956     host pf-us1-dfs2
 1   hdd  7.27739         osd.1
 3   hdd  7.27739         osd.3
 7   hdd  7.27739         osd.7
 9   hdd  7.27739         osd.9
-7       14.55478     host pf-us1-dfs3
 2   hdd  7.27739         osd.2
 4   hdd  7.27739         osd.4

root@pf-us1-dfs3:/home/rodrigo# ceph osd crush rule ls
replicated_rule

root@pf-us1-dfs3:/home/rodrigo# ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]

On Tue, Jan 8, 2019 at 11:35 AM Yoann Moulin wrote:
> Hello,
>
> > Hi Yoann, thanks for your response.
> > Here are the results of the commands.
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd df
> > ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
> >  0   hdd 7.27739  1.00000 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310
> >  5   hdd 7.27739  1.00000 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271
> >  6   hdd 7.27739  1.00000 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49
> >  8   hdd 7.27739  1.00000 7.3 TiB 2.5 GiB 7.3 TiB  0.03 0.00  42
> >  1   hdd 7.27739  1.00000 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285
> >  3   hdd 7.27739  1.00000 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296
> >  7   hdd 7.27739  1.00000 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53
> >  9   hdd 7.27739  1.00000 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38
> >  2   hdd 7.27739  1.00000 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321
> >  4   hdd 7.27739  1.00000 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351
> >                    TOTAL  73 TiB  39 TiB  34 TiB  53.13
> > MIN/MAX VAR: 0/1.79  STDDEV: 41.15
>
> It looks like you don't have a good balance between your OSDs. What is
> your failure domain?
>
> Could you provide your crush map?
> http://docs.ceph.com/docs/luminous/rados/operations/crush-map/
>
> ceph osd crush tree
> ceph osd crush rule ls
> ceph osd crush rule dump
>
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
> > pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 471 flags hashpspool,full stripe_width 0
> > pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/439 flags hashpspool,full stripe_width 0 application cephfs
> > pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
> > pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
> > pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
> > pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
> > pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
>
> You may need to increase the pg num for the cephfs_data pool. But before
> you do, you must understand the impact: https://ceph.com/pgcalc/
> You can't decrease pg_num; if it is set too high you may have trouble in
> your cluster.
>
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
> > ID CLASS WEIGHT   TYPE NAME            STATUS REWEIGHT PRI-AFF
> > -1       72.77390 root default
> > -3       29.10956     host pf-us1-dfs1
> >  0   hdd  7.27739         osd.0            up  1.00000 1.00000
> >  5   hdd  7.27739         osd.5            up  1.00000 1.00000
> >  6   hdd  7.27739         osd.6            up  1.00000 1.00000
> >  8   hdd  7.27739         osd.8            up  1.00000 1.00000
> > -5       29.10956     host pf-us1-dfs2
> >  1   hdd  7.27739         osd.1            up  1.00000 1.00000
> >  3   hdd  7.27739         osd.3            up  1.00000 1.00000
> >  7   hdd  7.27739         osd.7            up  1.00000 1.00000
> >  9   hdd  7.27739         osd.9            up  1.00000 1.00000
> > -7       14.55478     host pf-us1-dfs3
> >  2   hdd  7.27739         osd.2            up  1.00000
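[Editor's note: a quick sanity check of what the crush rule above implies. This is my own back-of-envelope arithmetic, using the host weights from the crush tree; it ignores full ratios and placement imbalance.]

```shell
# With size=3, failure domain "host" (chooseleaf_firstn over type "host"),
# and exactly 3 hosts, every PG must place one replica on each host, so the
# usable data capacity is capped by the smallest host (pf-us1-dfs3).
awk 'BEGIN {
    h1 = 29.10956; h2 = 29.10956; h3 = 14.55478   # host weights (TiB)
    printf "total raw:          %.2f TiB\n", h1 + h2 + h3
    printf "capped usable data: %.2f TiB\n", h3
    printf "pf-us1-dfs3 stores %.0f%% of the data on %.0f%% of the capacity\n",
           100 / 3, 100 * h3 / (h1 + h2 + h3)
}'
```

That last line is why osd.2 and osd.4 fill up first even though the cluster as a whole is only about half full.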
Re: [ceph-users] Problem with CephFS - No space left on device
Thanks again Kevin. If I reduce the size flag to a value of 2, would that fix
the problem?

Regards

On Tue, Jan 8, 2019 at 11:28 AM Kevin Olbrich wrote:
> You use replication 3 with failure domain "host".
> OSDs 2 and 4 are full, that's why your pool is also full.
> You need to add two disks to pf-us1-dfs3 or swap one from the larger
> nodes to this one.
>
> Kevin
>
> Am Di., 8. Jan. 2019 um 15:20 Uhr schrieb Rodrigo Embeita:
> >
> > Hi Yoann, thanks for your response.
> > Here are the results of the commands.
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd df
> > [...]
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
> > [...]
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
> > [...]
> >
> > Thanks for your help guys.
> >
> > On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin wrote:
> >> Hello,
> >>
> >> > Hi guys, I need your help.
> >> > I'm new with CephFS and we started using it as file storage.
> >> > Today we are getting "no space left on device" but I'm seeing that we
> >> > have plenty of space on the filesystem.
> >> > Filesystem                                                          Size  Used  Avail  Use%  Mounted on
> >> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts  73T   39T   35T    54%   /mnt/cephfs
> >> >
> >> > We have 35TB of disk space. I've added 2 additional OSD disks with 7
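[Editor's note: rough numbers on the size=2 question. This is a sketch only, using the host weights from the earlier `ceph osd tree` output; it assumes perfect balance and ignores full ratios. Note also that size=2 halves the redundancy, and with min_size 2 a single OSD failure would block writes, so adding disks to pf-us1-dfs3 as Kevin suggests is the safer fix.]

```shell
awk 'BEGIN {
    h1 = 29.10956; h2 = 29.10956; h3 = 14.55478   # host raw weights (TiB)
    total = h1 + h2 + h3
    # size=3 over 3 hosts: every PG needs a replica on each host,
    # so data capacity is capped by the smallest host.
    printf "size=3: ~%.1f TiB of data\n", h3
    # size=2: replicas land on any 2 of the 3 hosts; since no single host
    # holds more than half the total weight, roughly total/2 becomes usable.
    printf "size=2: ~%.1f TiB of data\n", total / 2
}'
```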
Re: [ceph-users] Problem with CephFS - No space left on device
Hi Kevin, thanks for your answer. How can I check the (re-)weights?

On Tue, Jan 8, 2019 at 10:36 AM Kevin Olbrich wrote:
> Looks like the same problem as mine:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032054.html
>
> The free space shown is the total, while Ceph is limited by the smallest
> free space (the worst OSD).
> Please check your (re-)weights.
>
> Kevin
>
> Am Di., 8. Jan. 2019 um 14:32 Uhr schrieb Rodrigo Embeita:
> >
> > Hi guys, I need your help.
> > I'm new with CephFS and we started using it as file storage.
> > Today we are getting "no space left on device" but I'm seeing that we
> > have plenty of space on the filesystem.
> > Filesystem                                                          Size  Used  Avail  Use%  Mounted on
> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts  73T   39T   35T    54%   /mnt/cephfs
> >
> > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB
> > each but I'm getting the error "No space left on device" every time that
> > I want to add a new file.
> > After adding the 2 additional OSD disks I'm seeing that the load is
> > being distributed among the cluster.
> > Please I need your help.
> >
> > root@pf-us1-dfs1:/etc/ceph# ceph -s
> > [...]

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Problem with CephFS - No space left on device
Hi Yoann, thanks for your response. Here are the results of the commands.

root@pf-us1-dfs2:/var/log/ceph# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
 0   hdd 7.27739  1.00000 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310
 5   hdd 7.27739  1.00000 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271
 6   hdd 7.27739  1.00000 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49
 8   hdd 7.27739  1.00000 7.3 TiB 2.5 GiB 7.3 TiB  0.03 0.00  42
 1   hdd 7.27739  1.00000 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285
 3   hdd 7.27739  1.00000 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296
 7   hdd 7.27739  1.00000 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53
 9   hdd 7.27739  1.00000 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38
 2   hdd 7.27739  1.00000 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321
 4   hdd 7.27739  1.00000 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351
                   TOTAL  73 TiB  39 TiB  34 TiB  53.13
MIN/MAX VAR: 0/1.79  STDDEV: 41.15

root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 471 flags hashpspool,full stripe_width 0
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/439 flags hashpspool,full stripe_width 0 application cephfs
pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw

root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
ID CLASS WEIGHT   TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       72.77390 root default
-3       29.10956     host pf-us1-dfs1
 0   hdd  7.27739         osd.0            up  1.00000 1.00000
 5   hdd  7.27739         osd.5            up  1.00000 1.00000
 6   hdd  7.27739         osd.6            up  1.00000 1.00000
 8   hdd  7.27739         osd.8            up  1.00000 1.00000
-5       29.10956     host pf-us1-dfs2
 1   hdd  7.27739         osd.1            up  1.00000 1.00000
 3   hdd  7.27739         osd.3            up  1.00000 1.00000
 7   hdd  7.27739         osd.7            up  1.00000 1.00000
 9   hdd  7.27739         osd.9            up  1.00000 1.00000
-7       14.55478     host pf-us1-dfs3
 2   hdd  7.27739         osd.2            up  1.00000 1.00000
 4   hdd  7.27739         osd.4            up  1.00000 1.00000

Thanks for your help guys.

On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin wrote:
> Hello,
>
> > Hi guys, I need your help.
> > I'm new with CephFS and we started using it as file storage.
> > Today we are getting "no space left on device" but I'm seeing that we
> > have plenty of space on the filesystem.
> > Filesystem                                                          Size  Used  Avail  Use%  Mounted on
> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts  73T   39T   35T    54%   /mnt/cephfs
> >
> > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB
> > each but I'm getting the error "No space left on device" every time that
> > I want to add a new file.
> > After adding the 2 additional OSD disks I'm seeing that the load is
> > being distributed among the cluster.
> > Please I need your help.
>
> Could you give us the output of
>
> ceph osd df
> ceph osd pool ls detail
> ceph osd tree
>
> Best regards,
>
> --
> Yoann Moulin
> EPFL IC-IT
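[Editor's note: regarding Yoann's pg_num caution, the pool listing above already lets you estimate the PG load per OSD before raising anything. A rough sketch of my own; the usual guidance is on the order of ~100 PG replicas per OSD, so this cluster is already above that and raising pg_num on cephfs_data would push it higher.]

```shell
# pg_num per pool from "ceph osd pool ls detail", all pools size=3:
# 128 (poolcephfs) + 256 (cephfs_data) + 256 (cephfs_metadata) + 4 x 8 (rgw)
awk 'BEGIN {
    pg_replicas = (128 + 256 + 256 + 4 * 8) * 3
    osds = 10
    printf "PG replicas per OSD: %.0f\n", pg_replicas / osds
}'
```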
Re: [ceph-users] Problem with CephFS - No space left on device
I believe I found something but I don't know how to fix it. I ran "ceph df"
and I'm seeing that cephfs_data and cephfs_metadata are at 100% USED. How can
I increase the cephfs_data and cephfs_metadata pools? Sorry, I'm new with Ceph.

root@pf-us1-dfs1:/etc/ceph# ceph df
GLOBAL:
    SIZE    AVAIL   RAW USED  %RAW USED
    73 TiB  34 TiB  39 TiB    53.12
POOLS:
    NAME                 ID  USED     %USED   MAX AVAIL  OBJECTS
    poolcephfs            1      0 B       0        0 B          0
    cephfs_data           2  3.6 TiB  100.00       0 B   169273821
    cephfs_metadata       3  1.0 GiB  100.00       0 B      208981
    .rgw.root             4  1.1 KiB  100.00       0 B           4
    default.rgw.control   5      0 B       0        0 B          8
    default.rgw.meta      6      0 B       0        0 B          0
    default.rgw.log       7      0 B       0        0 B        207

On Tue, Jan 8, 2019 at 10:30 AM Rodrigo Embeita wrote:
> Hi guys, I need your help.
> I'm new with CephFS and we started using it as file storage.
> Today we are getting "no space left on device" but I'm seeing that we have
> plenty of space on the filesystem.
> Filesystem                                                          Size  Used  Avail  Use%  Mounted on
> 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts  73T   39T   35T    54%   /mnt/cephfs
>
> We have 35TB of disk space. I've added 2 additional OSD disks with 7TB
> each but I'm getting the error "No space left on device" every time that I
> want to add a new file.
> After adding the 2 additional OSD disks I'm seeing that the load is being
> distributed among the cluster.
> Please I need your help.
>
> root@pf-us1-dfs1:/etc/ceph# ceph -s
> [...]
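[Editor's note: the 0 B MAX AVAIL above is not a per-pool quota. `ceph df` derives MAX AVAIL from the most-loaded OSD the pool's crush rule can write to, scaled by the full ratio. A sketch with the numbers from the earlier `ceph osd df` output, assuming the default mon_osd_full_ratio of 0.95:]

```shell
awk 'BEGIN {
    full_ratio = 0.95    # assumed default mon_osd_full_ratio
    worst_use  = 0.9502  # osd.3 %USE from "ceph osd df"
    headroom   = full_ratio - worst_use
    if (headroom <= 0)
        print "fullest OSD is past the full ratio -> pools report 0 B MAX AVAIL"
    else
        printf "headroom on fullest OSD: %.1f%%\n", headroom * 100
}'
```

If that reading is right, the way to get space back is to relieve the fullest OSDs (rebalance, or add capacity on pf-us1-dfs3), not to "grow" the pools; replicated pool capacity is not provisioned per pool.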
[ceph-users] Problem with CephFS - No space left on device
Hi guys, I need your help.
I'm new with CephFS and we started using it as file storage.
Today we are getting "no space left on device" but I'm seeing that we have plenty of space on the filesystem.

Filesystem                                                          Size  Used  Avail  Use%  Mounted on
192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts  73T   39T   35T    54%   /mnt/cephfs

We have 35TB of disk space. I've added 2 additional OSD disks with 7TB each but I'm getting the error "No space left on device" every time that I want to add a new file.
After adding the 2 additional OSD disks I'm seeing that the load is being distributed among the cluster.
Please I need your help.

root@pf-us1-dfs1:/etc/ceph# ceph -s
  cluster:
    id:     609e9313-bdd3-449e-a23f-3db8382e71fb
    health: HEALTH_ERR
            2 backfillfull osd(s)
            1 full osd(s)
            7 pool(s) full
            197313040/508449063 objects misplaced (38.807%)
            Degraded data redundancy: 2/508449063 objects degraded (0.000%), 2 pgs degraded
            Degraded data redundancy (low space): 16 pgs backfill_toofull, 3 pgs recovery_toofull

  services:
    mon: 3 daemons, quorum pf-us1-dfs2,pf-us1-dfs1,pf-us1-dfs3
    mgr: pf-us1-dfs3(active), standbys: pf-us1-dfs2
    mds: pagefs-2/2/2 up {0=pf-us1-dfs3=up:active,1=pf-us1-dfs1=up:active}, 1 up:standby
    osd: 10 osds: 10 up, 10 in; 189 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   7 pools, 416 pgs
    objects: 169.5 M objects, 3.6 TiB
    usage:   39 TiB used, 34 TiB / 73 TiB avail
    pgs:     2/508449063 objects degraded (0.000%)
             197313040/508449063 objects misplaced (38.807%)
             224 active+clean
             168 active+remapped+backfill_wait
             16  active+remapped+backfill_wait+backfill_toofull
             5   active+remapped+backfilling
             2   active+recovery_toofull+degraded
             1   active+recovery_toofull

  io:
    recovery: 1.1 MiB/s, 31 objects/s
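[Editor's note: one more observation on the numbers in this status, hedged since it depends on the BlueStore settings in use. 3.6 TiB of data at size=3 should consume roughly 11 TiB raw, yet 39 TiB is used. With ~169.5 M objects, the per-replica footprint works out far above the average object size, which is consistent with allocation overhead on many small files (BlueStore's default min_alloc_size for HDDs was 64 KiB in this era):]

```shell
awk 'BEGIN {
    data_tib = 3.6; raw_tib = 39; objects = 169.5e6; size = 3
    tib = 2 ^ 40   # bytes per TiB
    printf "avg object size:      %.1f KiB\n", data_tib * tib / objects / 1024
    printf "raw used per replica: %.1f KiB per object\n",
           raw_tib * tib / (objects * size) / 1024
}'
```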
Re: [ceph-users] Problem with CephFS
Hi Daniel, thanks a lot for your help. Do you know how I can recover the data
in this scenario, since I lost 1 node with 6 OSDs? My configuration had 12
OSDs (6 per host).

Regards

On Wed, Nov 21, 2018 at 3:16 PM Daniel Baumann wrote:
> Hi,
>
> On 11/21/2018 07:04 PM, Rodrigo Embeita wrote:
> > Reduced data availability: 7 pgs inactive, 7 pgs down
>
> This is your first problem: unless you have all data available again,
> CephFS will not be back.
>
> After that, I would take care of the redundancy next, and get the one
> missing monitor back online.
>
> Once that is done, get the MDS working again and your CephFS should be
> back in service.
>
> If you encounter problems with any of the steps, send all the necessary
> commands and outputs to the list and I (or others) can try to help.
>
> Regards,
> Daniel
[ceph-users] Problem with CephFS
Hi guys, maybe someone can help me. I'm new with CephFS and I was testing an
installation of Ceph Mimic with ceph-deploy on 2 Ubuntu 16.04 nodes. These
two nodes have 6 OSD disks each. I installed CephFS and 2 MDS services.
The problem is that I copied a lot of data (15 million small files) and then
I lost one of the 2 nodes. The lost node was running the active MDS; the
service was supposed to move to the other Ceph host, but the MDS there got
stuck in rejoin status.
The status of the cluster seems to be down and I'm not able to mount CephFS.

root@pf-us1-dfs1:/var/log/ceph# ceph status
  cluster:
    id:     459cdedc-488e-49ed-8b16-36cf843cef76
    health: HEALTH_WARN
            1 filesystem is degraded
            1 MDSs report slow metadata IOs
            3 osds down
            1 host (6 osds) down
            5313/50445780 objects misplaced (0.011%)
            Reduced data availability: 7 pgs inactive, 7 pgs down
            Degraded data redundancy: 25192943/50445780 objects degraded (49.941%), 265 pgs degraded, 283 pgs undersized
            1/3 mons down, quorum pf-us1-dfs3,pf-us1-dfs1

  services:
    mon: 3 daemons, quorum pf-us1-dfs3,pf-us1-dfs1, out of quorum: pf-us1-dfs2
    mgr: pf-us1-dfs3(active)
    mds: cephfs-1/1/1 up {0=pf-us1-dfs1=up:rejoin}, 1 up:standby
    osd: 13 osds: 6 up, 9 in; 6 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   7 pools, 296 pgs
    objects: 25.22 M objects, 644 GiB
    usage:   2.0 TiB used, 42 TiB / 44 TiB avail
    pgs:     2.365% pgs not active
             25192943/50445780 objects degraded (49.941%)
             5313/50445780 objects misplaced (0.011%)
             265 active+undersized+degraded
             18  active+undersized
             7   down
             6   active+clean+remapped

And the MDS service has been writing the following to the log for over 14
hours without stopping.
2018-11-21 10:06:12.585 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18421 from mon.2
2018-11-21 10:06:16.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18422 from mon.2
2018-11-21 10:06:20.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18423 from mon.2
2018-11-21 10:06:24.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18424 from mon.2
2018-11-21 10:06:32.590 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18425 from mon.2
2018-11-21 10:06:36.594 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18426 from mon.2
2018-11-21 10:06:40.606 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18427 from mon.2
2018-11-21 10:06:44.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18428 from mon.2
2018-11-21 10:06:52.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18429 from mon.2
2018-11-21 10:06:56.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18430 from mon.2
2018-11-21 10:07:00.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18431 from mon.2
2018-11-21 10:07:04.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18432 from mon.2
2018-11-21 10:07:12.590 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18433 from mon.2
2018-11-21 10:07:16.602 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18434 from mon.2
2018-11-21 10:07:20.602 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18435 from mon.2
2018-11-21 10:07:24.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18436 from mon.2
2018-11-21 10:07:32.590 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18437 from mon.2
2018-11-21 10:07:36.614 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18438 from mon.2
2018-11-21 10:07:40.626 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to version 18439 from mon.2

Please, someone help me.
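[Editor's note: a quick read of the degraded counters in the status above, as a sketch with the object total rounded from that output. The ratio of replica slots to objects is about 2, which suggests these pools are running size=2 across the 2 data hosts; losing one host then degrades roughly half of all replicas, and the 7 down PGs are the ones whose surviving copy also sits on a down OSD.]

```shell
awk 'BEGIN {
    degraded = 25192943; total_slots = 50445780; objects = 25.22e6
    printf "degraded: %.3f%%\n", 100 * degraded / total_slots
    printf "replica slots per object: %.2f (looks like size=2)\n",
           total_slots / objects
}'
```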