Re: [ceph-users] ceph df shows 100% used
Hi,

On Fri, Jan 19, 2018 at 8:31 PM, zhangbingyin wrote:
> 'MAX AVAIL' in the 'ceph df' output represents the amount of data that can
> be used before the first OSD becomes full, and not the sum of all free
> space across a set of OSDs.

Thank you very much. I figured this out by the end of the day. That is the
answer. I'm not sure this is in the ceph.com docs, though. Now I know the
problem is indeed solved (by doing a proper reweight).

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
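For readers of the archive, a back-of-the-envelope sketch of that rule (an approximation of the idea, not Ceph's exact code): CRUSH keeps giving every OSD a roughly weight-proportional share of new writes, so the projection is pinned to whichever OSD would fill first, something like

  MAX AVAIL ~= min over OSDs of ( osd_free / osd_share_of_crush_weight ) / replica_count

In this thread's `ceph osd df tree` output, osd.8 was 97% used with only 167G free, so pools whose rule placed data on it projected to full even though the cluster as a whole still had ~45T free.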
Re: [ceph-users] ceph df shows 100% used
'MAX AVAIL' in the 'ceph df' output represents the amount of data that can be
used before the first OSD becomes full, and not the sum of all free space
across a set of OSDs.

Original message
From: Webert de Souza Lima
To: ceph-users
Sent: Friday, January 19, 2018, 20:20
Subject: Re: [ceph-users] ceph df shows 100% used

> While it seemed to be solved yesterday, today the %USED has grown a lot
> again. [...]
Re: [ceph-users] ceph df shows 100% used
While it seemed to be solved yesterday, today the %USED has grown a lot again. See:

~# ceph osd df tree
http://termbin.com/0zhk

~# ceph df detail
http://termbin.com/thox

94% USED while there is about 21TB worth of data; size = 2 means ~42TB of RAW
usage, but the OSDs in that root sum to ~70TB of available space.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*

On Thu, Jan 18, 2018 at 8:21 PM, Webert de Souza Lima wrote:
> With the help of robbat2 and llua on the IRC channel I was able to solve
> this situation by taking down the 2-OSD-only hosts. [...]
Re: [ceph-users] ceph df shows 100% used
With the help of robbat2 and llua on the IRC channel I was able to solve this
situation by taking down the 2-OSD-only hosts. After crush reweighting OSDs 8
and 23 from host mia1-master-fe02 to 0, ceph df showed the expected storage
capacity usage (about 70%).

With this in mind, those guys have told me that it is due to the cluster being
uneven and unable to balance properly. It makes sense, and it worked. But to
me it is still very unexpected behaviour for ceph to say that the pools are
100% full and Available Space is 0. There were 3 hosts and repl. size = 2; if
the host with only 2 OSDs were full (it wasn't), ceph could still use space
from the OSDs on the other hosts.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
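For anyone finding this in the archives, the reweighting described above amounts to something like the following (a sketch; the OSD ids are the ones from this cluster, and the PGs backfill to the remaining hosts before the numbers settle):

  ~# ceph osd crush reweight osd.8 0
  ~# ceph osd crush reweight osd.23 0
  ~# ceph df detail    # re-check MAX AVAIL once backfill finishes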
Re: [ceph-users] ceph df shows 100% used
Hi David, thanks for replying.

On Thu, Jan 18, 2018 at 5:03 PM David Turner wrote:
> You can have overall space available in your cluster because not all of
> your disks are in the same crush root. You have multiple roots
> corresponding to multiple crush rulesets. All pools using crush ruleset 0
> are full because all of the osds in that crush rule are full.

So I did check this. The usage of the OSDs that belonged to that root
(default) was about 60%. All the pools using crush ruleset 0 were being shown
as 100% used, and there was only 1 near-full OSD in that crush rule. That's
what is so weird about it.

On Thu, Jan 18, 2018 at 8:05 PM, David Turner wrote:
> `ceph osd df` is a good command for you to see what's going on. Compare
> the osd numbers with `ceph osd tree`.

I am sorry, I forgot to send this output; here it is. I have added 2 OSDs to
that crush root, borrowed from the host mia1-master-ds05, to see if the
available space would grow, but it didn't. So adding new OSDs to it didn't
have any effect.

~# ceph osd df tree
 ID WEIGHT   REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
 -9 13.5            - 14621G  2341G 12279G 16.02 0.31   0 root databases
 -8 6.5             -  7182G   835G  6346G 11.64 0.22   0     host mia1-master-ds05
 20 3.0       1.0     3463G   380G  3082G 10.99 0.21 260         osd.20
 17 3.5       1.0     3719G   455G  3263G 12.24 0.24 286         osd.17
-10 7.0             -  7438G  1505G  5932G 20.24 0.39   0     host mia1-master-fe01
 21 3.5       1.0     3719G   714G  3004G 19.22 0.37 269         osd.21
 22 3.5       1.0     3719G   791G  2928G 21.27 0.41 295         osd.22
 -3 2.39996         -  2830G  1647G  1182G 58.22 1.12   0 root databases-ssd
 -5 1.19998         -  1415G   823G   591G 58.22 1.12   0     host mia1-master-ds02-ssd
 24 0.3       1.0      471G   278G   193G 58.96 1.14 173         osd.24
 25 0.3       1.0      471G   276G   194G 58.68 1.13 172         osd.25
 26 0.3       1.0      471G   269G   202G 57.03 1.10 167         osd.26
 -6 1.19998         -  1415G   823G   591G 58.22 1.12   0     host mia1-master-ds03-ssd
 27 0.3       1.0      471G   244G   227G 51.87 1.00 152         osd.27
 28 0.3       1.0      471G   281G   190G 59.63 1.15 175         osd.28
 29 0.3       1.0      471G   297G   173G 63.17 1.22 185         osd.29
 -1 71.69997        - 76072G 44464G 31607G 58.45 1.13   0 root default
 -2 26.59998        - 29575G 17334G 12240G 58.61 1.13   0     host mia1-master-ds01
  0 3.2       1.0     3602G  1907G  1695G 52.94 1.02  90         osd.0
  1 3.2       1.0     3630G  2721G   908G 74.97 1.45 112         osd.1
  2 3.2       1.0     3723G  2373G  1349G 63.75 1.23  98         osd.2
  3 3.2       1.0     3723G  1781G  1941G 47.85 0.92 105         osd.3
  4 3.2       1.0     3723G  1880G  1843G 50.49 0.97  95         osd.4
  5 3.2       1.0     3723G  2465G  1257G 66.22 1.28 111         osd.5
  6 3.7       1.0     3723G  1722G  2001G 46.25 0.89 109         osd.6
  7 3.7       1.0     3723G  2481G  1241G 66.65 1.29 126         osd.7
 -4 8.5             -  9311G  8540G   770G 91.72 1.77   0     host mia1-master-fe02
  8 5.5       0.7     5587G  5419G   167G 97.00 1.87 189         osd.8
 23 3.0       1.0     3724G  3120G   603G 83.79 1.62 128         osd.23
 -7 29.5            - 29747G 17821G 11926G 59.91 1.16   0     host mia1-master-ds04
  9 3.7       1.0     3718G  2493G  1224G 67.07 1.29 114         osd.9
 10 3.7       1.0     3718G  2454G  1264G 66.00 1.27  90         osd.10
 11 3.7       1.0     3718G  2202G  1516G 59.22 1.14 116         osd.11
 12 3.7       1.0     3718G  2290G  1427G 61.61 1.19 113         osd.12
 13 3.7       1.0     3718G  2015G  1703G 54.19 1.05 112         osd.13
 14 3.7       1.0     3718G  1264G  2454G 34.00 0.66 101         osd.14
 15 3.7       1.0     3718G  2195G  1522G 59.05 1.14 104         osd.15
 16 3.7       1.0     3718G  2905G   813G 78.13 1.51 130         osd.16
-11 7.0             -  7438G   768G  6669G 10.33 0.20   0     host mia1-master-ds05-borrowed-osds
 18 3.5       1.0     3719G   393G  3325G 10.59 0.20 262         osd.18
 19 3.5       1.0     3719G   374G  3344G 10.07 0.19 256         osd.19
              TOTAL 93524G 48454G 45069G 51.81
MIN/MAX VAR: 0.19/1.87 STDDEV: 22.02

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
Re: [ceph-users] ceph df shows 100% used
Your hosts are also not balanced in your default root. Your failure domain is
host, but one of your hosts has 8.5TB of storage in it compared to 26.6TB and
29.6TB. You only have size=2 (along with min_size=1, which is bad for a lot
of reasons), so it should still be able to place data mostly between ds01 and
ds04 and ignore fe02, since it doesn't have much space at all. Anyway,
`ceph osd df` will be good output to see what the distribution between osds
looks like.

 -1 64.69997 root default
 -2 26.59998     host mia1-master-ds01
  0  3.2             osd.0   up 1.0 1.0
  1  3.2             osd.1   up 1.0 1.0
  2  3.2             osd.2   up 1.0 1.0
  3  3.2             osd.3   up 1.0 1.0
  4  3.2             osd.4   up 1.0 1.0
  5  3.2             osd.5   up 1.0 1.0
  6  3.7             osd.6   up 1.0 1.0
  7  3.7             osd.7   up 1.0 1.0
 -4  8.5         host mia1-master-fe02
  8  5.5             osd.8   up 1.0 1.0
 23  3.0             osd.23  up 1.0 1.0
 -7 29.5         host mia1-master-ds04
  9  3.7             osd.9   up 1.0 1.0
 10  3.7             osd.10  up 1.0 1.0
 11  3.7             osd.11  up 1.0 1.0
 12  3.7             osd.12  up 1.0 1.0
 13  3.7             osd.13  up 1.0 1.0
 14  3.7             osd.14  up 1.0 1.0
 15  3.7             osd.15  up 1.0 1.0
 16  3.7             osd.16  up 1.0 1.0

On Thu, Jan 18, 2018 at 5:05 PM David Turner wrote:
> `ceph osd df` is a good command for you to see what's going on. Compare
> the osd numbers with `ceph osd tree`. [...]
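To put rough numbers on that imbalance (back-of-the-envelope, assuming a roughly weight-proportional spread, which is what CRUSH aims for): fe02 holds 8.5 of the 64.7 total weight in root default, about 13%. The ~21TB of data at size=2 is ~42TB raw, and 13% of that is ~5.5TB aimed at a host with only ~9.3TB of disk, leaving little headroom for CRUSH's natural placement variance. Once those two OSDs fill, every pool on that rule reports full, regardless of the free space left on ds01 and ds04.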
Re: [ceph-users] ceph df shows 100% used
`ceph osd df` is a good command for you to see what's going on. Compare the
osd numbers with `ceph osd tree`.

On Thu, Jan 18, 2018 at 5:03 PM David Turner wrote:
> You can have overall space available in your cluster because not all of
> your disks are in the same crush root. You have multiple roots
> corresponding to multiple crush rulesets. [...]
Re: [ceph-users] ceph df shows 100% used
You can have overall space available in your cluster because not all of your
disks are in the same crush root. You have multiple roots corresponding to
multiple crush rulesets. All pools using crush ruleset 0 are full because all
of the osds in that crush rule are full.

On Thu, Jan 18, 2018 at 3:34 PM Webert de Souza Lima wrote:
> Sorry, I forgot: this is ceph jewel 10.2.10. [...]
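A quick way to verify that on a running cluster is to walk from pool to ruleset to OSDs (a sketch, using jewel-era command names):

  ~# ceph osd pool ls detail     # note each pool's crush_ruleset
  ~# ceph osd crush rule dump    # see which root/hosts each ruleset draws from
  ~# ceph osd df tree            # per-OSD fill, grouped by that hierarchy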
Re: [ceph-users] ceph df shows 100% used
Sorry, I forgot: this is ceph jewel 10.2.10.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
Re: [ceph-users] ceph df shows 100% used
Also, there is no quota set for the pools.

Here is "ceph osd pool get xxx all": http://termbin.com/ix0n

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
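Pool quotas can also be read directly, which is a quicker check than dumping every pool property (assuming jewel syntax; the pool name here is a placeholder):

  ~# ceph osd pool get-quota <poolname>    # prints max objects / max bytes, shown as N/A when unset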
[ceph-users] ceph df shows 100% used
Hello,

I'm running a nearly out-of-service radosgw (very slow to write new objects),
and I suspect it's because ceph df is showing 100% usage in some pools,
though I don't know where that information comes from.

Pools:
~# ceph osd pool ls detail -> http://termbin.com/lsd0

Crush rules (the important one is rule 0):
~# ceph osd crush rule dump -> http://termbin.com/wkpo

OSD tree:
~# ceph osd tree -> http://termbin.com/87vt

ceph df, which shows 100% usage:
~# ceph df detail -> http://termbin.com/15mz

ceph status, which shows 45600 GB / 93524 GB avail:
~# ceph -s -> http://termbin.com/wycq

Any thoughts?

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
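One more command worth including in a report like this (a suggestion; the exact wording of its output varies by release):

  ~# ceph health detail    # should name any near-full/full OSDs and full pools explicitly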