[ceph-users] Re: mons excessive writes to local disk and SSD wearout
On 2/27/23 03:22, Andrej Filipcic wrote:
> On 2/24/23 15:18, Dan van der Ster wrote:
> > Hi Andrej,
> >
> > That doesn't sound right -- I checked a couple of our clusters just now
> > and the mon filesystem is writing at just a few 100 kB/s.
>
> Most of the time it's a few 10 kB/s, but then it jumps a lot, a few times
> a minute. Did you measure it over a long period, e.g. 1 min?
>
> > debug_mon = 10 should clarify the root cause. Perhaps it's logm from
> > some persistent slow ops?
>
> Nothing like that. The contents of the log look more or less normal, just
> chattier than on the other clusters I have compared it to.
>
> Best,
> Andrej

Compression is the first thing that comes to mind, though I'd caution you to be careful. Try it out on a test cluster first if possible. You could also try tuning rocksdb's other configurables, but it's been a long time since anyone looked at rocksdb tuning for the mons afaik. I'd suggest reading through our OSD rocksdb tuning article to get some ideas:

https://ceph.io/en/news/blog/2022/rocksdb-tuning-deep-dive/

Hope that helps!
Mark

> > Cheers,
> > Dan
> >
> > On Fri, Feb 24, 2023 at 7:36 AM Andrej Filipcic wrote:
> > > Hi,
> > >
> > > On our large Ceph cluster (60 servers, 1600 OSDs) we have observed
> > > that the small system NVMes are wearing out rapidly. Our monitoring
> > > shows mon writes to store.db averaging about 10 MB/s. For small
> > > system NVMes of 250 GB and a DWPD of ~1, this turns out to be too
> > > much: 0.8 TB/day, or 1.5 PB in 5 years, which is too much even for a
> > > 3 DWPD drive of the same capacity.
> > >
> > > Apart from replacing the drives with larger or more durable ones,
> > > preferably both, do you have any suggestions for how these writes
> > > can be reduced? Actually, the mon writes match the 0.15 Hz rate at
> > > which 64 MB .sst files are created.
> > >
> > > Best regards,
> > > Andrej
> > >
> > > --
> > > prof. dr. Andrej Filipcic, E-mail: andrej.filip...@ijs.si
> > > Department of Experimental High Energy Physics - F9
> > > Jozef Stefan Institute, Jamova 39, P.o.Box 3000
> > > SI-1001 Ljubljana, Slovenia
> > > Tel.: +386-1-477-3674 Fax: +386-1-425-7074

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
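
The figures in the original post (10 MB/s, 0.8 TB/day, 1.5 PB in 5 years, 0.15 Hz of 64 MB .sst files) can be cross-checked with a few lines of arithmetic. The following is only a sketch: it assumes decimal units (1 MB = 10^6 bytes) and that the DWPD rating applies over the same 5-year period mentioned in the post, and the function names are made up for illustration.

```python
# Cross-check of the endurance arithmetic quoted in the thread.
# Assumptions (not stated in the thread): decimal units and a DWPD
# rating that is valid over the same 5-year period.

SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def daily_writes_tb(rate_mb_s: float) -> float:
    """Data written per day, in TB, at a sustained rate given in MB/s."""
    return rate_mb_s * 1e6 * SECONDS_PER_DAY / 1e12


def lifetime_writes_pb(rate_mb_s: float, years: float = 5.0) -> float:
    """Data written over `years`, in PB, at a sustained rate in MB/s."""
    return daily_writes_tb(rate_mb_s) * DAYS_PER_YEAR * years / 1000


def rated_endurance_pb(capacity_gb: float, dwpd: float, years: float = 5.0) -> float:
    """Total writes a drive is rated for, in PB (capacity x DWPD x days)."""
    return capacity_gb * 1e9 * dwpd * DAYS_PER_YEAR * years / 1e15


def sst_write_rate_mb_s(files_per_second: float, file_size_mb: float) -> float:
    """Write rate implied by creating .sst files of a given size at a given rate."""
    return files_per_second * file_size_mb


print(f"daily writes:       {daily_writes_tb(10):.3f} TB/day")
print(f"5-year writes:      {lifetime_writes_pb(10):.3f} PB")
print(f"250 GB at 1 DWPD:   {rated_endurance_pb(250, 1):.3f} PB rated")
print(f"250 GB at 3 DWPD:   {rated_endurance_pb(250, 3):.3f} PB rated")
print(f"implied by .sst:    {sst_write_rate_mb_s(0.15, 64):.1f} MB/s")
```

Under these assumptions a 250 GB drive at 3 DWPD is rated for roughly 1.37 PB over 5 years, which is below the roughly 1.58 PB that 10 MB/s adds up to, and the .sst creation rate alone accounts for about 9.6 MB/s.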
[ceph-users] Re: mons excessive writes to local disk and SSD wearout
On 2/24/23 15:18, Dan van der Ster wrote:
> Hi Andrej,
>
> That doesn't sound right -- I checked a couple of our clusters just now
> and the mon filesystem is writing at just a few 100 kB/s.

Most of the time it's a few 10 kB/s, but then it jumps a lot, a few times a minute. Did you measure it over a long period, e.g. 1 min?

> debug_mon = 10 should clarify the root cause. Perhaps it's logm from
> some persistent slow ops?

Nothing like that. The contents of the log look more or less normal, just chattier than on the other clusters I have compared it to.

Best,
Andrej

--
prof. dr. Andrej Filipcic, E-mail: andrej.filip...@ijs.si
Department of Experimental High Energy Physics - F9
Jozef Stefan Institute, Jamova 39, P.o.Box 3000
SI-1001 Ljubljana, Slovenia
Tel.: +386-1-477-3674 Fax: +386-1-477-3166
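
Because the writes arrive in bursts a few times a minute, a short sample can easily catch only the quiet baseline. One way to average over a longer window is to read the cumulative "sectors written" counter from /proc/diskstats twice and divide by the elapsed time. This is a minimal sketch, not something posted in the thread: it measures the whole block device rather than store.db alone, and the function names and the device name in the usage comment are hypothetical.

```python
import time

# /proc/diskstats reports sectors in fixed 512-byte units, regardless of
# the device's physical sector size.
SECTOR_BYTES = 512


def sectors_written(diskstats_text: str, device: str) -> int:
    """Return the cumulative 'sectors written' counter for `device`.

    In each /proc/diskstats line the device name is the 3rd field and the
    sectors-written counter is the 10th field.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise KeyError(f"device {device!r} not found in diskstats")


def average_write_rate(before: int, after: int, seconds: float) -> float:
    """Average write rate in bytes/s between two counter readings."""
    return (after - before) * SECTOR_BYTES / seconds


def sample(device: str, seconds: float = 60.0, path: str = "/proc/diskstats") -> float:
    """Average write rate of `device` in bytes/s over a `seconds`-long window."""
    with open(path) as f:
        before = sectors_written(f.read(), device)
    time.sleep(seconds)
    with open(path) as f:
        after = sectors_written(f.read(), device)
    return average_write_rate(before, after, seconds)


# Example (hypothetical device name; run on the mon host):
#   print(sample("nvme0n1", seconds=60.0) / 1e6, "MB/s")
```

A window of a minute or more spans several bursts, so the result should be closer to the long-term average than a reading taken over a second or two.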
[ceph-users] Re: mons excessive writes to local disk and SSD wearout
Hi Andrej,

That doesn't sound right -- I checked a couple of our clusters just now and the mon filesystem is writing at just a few 100 kB/s.

debug_mon = 10 should clarify the root cause. Perhaps it's logm from some persistent slow ops?

Cheers,
Dan
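
One way to apply the debug_mon = 10 suggestion temporarily is through the central config database, removing the override again afterwards so the mon logs do not add to the disk writes themselves. The sketch below only builds and, by default, prints the commands. It assumes a cluster with the `ceph config` interface and an admin keyring on the host; the helper names are hypothetical.

```python
import subprocess


def debug_mon_commands(level: int = 10):
    """Return (raise_cmd, restore_cmd) as argv lists for the ceph CLI.

    raise_cmd sets debug_mon for all mons; restore_cmd removes the
    override so the mons fall back to their previous setting.
    """
    raise_cmd = ["ceph", "config", "set", "mon", "debug_mon", str(level)]
    restore_cmd = ["ceph", "config", "rm", "mon", "debug_mon"]
    return raise_cmd, restore_cmd


def run(cmd, dry_run: bool = True) -> None:
    """Print the command, and execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


raise_cmd, restore_cmd = debug_mon_commands(10)
run(raise_cmd)    # dry run: prints the command only
run(restore_cmd)  # dry run: prints the command only
```

Leaving the dry run in place by default is deliberate: raising the mon debug level increases log volume, so it is worth checking the commands and restoring the setting once the logs have been collected.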