> Try adding the '-p' flag here? That should show preallocated extents. Would be interesting to run it on some index file which is larger than 1MB, for example.
# du -h --apparent-size index.000000108
23M     index.000000108

# du -h index.000000108
23M     index.000000108

# xfs_bmap -v -p index.000000108
index.000000108:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET            TOTAL FLAGS
   0: [0..2719]:       1175815920..1175818639  2 (3704560..3707279)    2720 00000
   1: [2720..5111]:    1175828904..1175831295  2 (3717544..3719935)    2392 00000
   2: [5112..7767]:    1175835592..1175838247  2 (3724232..3726887)    2656 00000
   3: [7768..10567]:   1175849896..1175852695  2 (3738536..3741335)    2800 00000
   4: [10568..15751]:  1175877808..1175882991  2 (3766448..3771631)    5184 00000
   5: [15752..18207]:  1175898864..1175901319  2 (3787504..3789959)    2456 00000
   6: [18208..20759]:  1175909192..1175911743  2 (3797832..3800383)    2552 00000
   7: [20760..23591]:  1175921616..1175924447  2 (3810256..3813087)    2832 00000
   8: [23592..26207]:  1175974872..1175977487  2 (3863512..3866127)    2616 00000
   9: [26208..28799]:  1175989496..1175992087  2 (3878136..3880727)    2592 00000
  10: [28800..31199]:  1175998552..1176000951  2 (3887192..3889591)    2400 00000
  11: [31200..33895]:  1176008336..1176011031  2 (3896976..3899671)    2696 00000
  12: [33896..36591]:  1176031696..1176034391  2 (3920336..3923031)    2696 00000
  13: [36592..39191]:  1176037440..1176040039  2 (3926080..3928679)    2600 00000
  14: [39192..41839]:  1176072008..1176074655  2 (3960648..3963295)    2648 00000
  15: [41840..44423]:  1176097752..1176100335  2 (3986392..3988975)    2584 00000
  16: [44424..46879]:  1176132144..1176134599  2 (4020784..4023239)    2456 00000

Wed, Jun 19, 2019 at 10:56, Todd Lipcon <t...@cloudera.com>:

>
>
> On Wed, Jun 19, 2019 at 12:49 AM Pavel Martynov <mr.xk...@gmail.com>
> wrote:
>
>> Hi Todd, thanks for the answer!
>>
>> > Any chance you've done something like copy the files away and back
>> > that might cause them to lose their sparseness?
>>
>> No, I don't think so. Recently we experienced some stability problems
>> with Kudu and ran the rebalancer a couple of times, in case that's
>> related. But we never used fs commands like cp/mv against Kudu dirs.
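Since sparseness is the question here: a quick way to spot files that have lost their holes is to compare apparent vs. on-disk size per file. A rough sketch using GNU coreutils (the Kudu path in the usage comment is just our layout):

```shell
# Print apparent vs. on-disk size (in KB) for one file. For a sparse
# index file the on-disk number should be much smaller; if the two are
# close, the file is fully allocated (holes lost, or XFS preallocation).
file_sizes() {
  printf '%s apparent=%sK disk=%sK\n' "$1" \
    "$(du -k --apparent-size "$1" | cut -f1)" \
    "$(du -k "$1" | cut -f1)"
}

# Usage (our layout):
# for f in /mnt/data01/kudu-tserver-wal/wals/*/index.*; do file_sizes "$f"; done
```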
>>
>> I ran du on the dir with all the WALs:
>> # du -sh /mnt/data01/kudu-tserver-wal/
>> 12G     /mnt/data01/kudu-tserver-wal/
>>
>> # du -sh --apparent-size /mnt/data01/kudu-tserver-wal/
>> 25G     /mnt/data01/kudu-tserver-wal/
>>
>> And on a WAL dir with many index files:
>> # du -sh --apparent-size /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>> 306M    /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>>
>> # du -sh /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>> 296M    /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>>
>> > Also, any chance you're using XFS here?
>>
>> Yes, exactly XFS. We use CentOS 7.6.
>>
>> What is interesting: there are not many holes in the index files in
>> /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>> (the WAL dir I mentioned before). There is only a single hole in a
>> single index file (out of 13 files):
>> # xfs_bmap -v index.000000120
>
> Try adding the '-p' flag here? That should show preallocated extents.
> Would be interesting to run it on some index file which is larger than
> 1MB, for example.
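As a cross-check, the extents that xfs_bmap prints can be summed mechanically. A sketch (it assumes plain `-v` output, where TOTAL is the last column, and the 512-byte block units that xfs_bmap reports; with `-p` the trailing FLAGS column would need skipping):

```shell
# Sum the TOTAL column of `xfs_bmap -v` output, skipping hole extents,
# and convert 512-byte blocks to allocated bytes.
sum_allocated() {
  awk '$1 ~ /^[0-9]+:$/ && $3 != "hole" { blocks += $NF }
       END { print blocks * 512 }'
}

# Usage: xfs_bmap -v index.000000120 | sum_allocated
```

Comparing that number against the file's apparent size shows how much of the file is really backed by disk blocks.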
>
>
>> index.000000120:
>> EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET            TOTAL
>>   0: [0..4231]:       1176541248..1176545479  2 (4429888..4434119)    4232
>>   1: [4232..9815]:    1176546592..1176552175  2 (4435232..4440815)    5584
>>   2: [9816..11583]:   1176552832..1176554599  2 (4441472..4443239)    1768
>>   3: [11584..13319]:  1176558672..1176560407  2 (4447312..4449047)    1736
>>   4: [13320..15239]:  1176565336..1176567255  2 (4453976..4455895)    1920
>>   5: [15240..17183]:  1176570776..1176572719  2 (4459416..4461359)    1944
>>   6: [17184..18999]:  1176575856..1176577671  2 (4464496..4466311)    1816
>>   7: [19000..20927]:  1176593552..1176595479  2 (4482192..4484119)    1928
>>   8: [20928..22703]:  1176599128..1176600903  2 (4487768..4489543)    1776
>>   9: [22704..24575]:  1176602704..1176604575  2 (4491344..4493215)    1872
>>  10: [24576..26495]:  1176611936..1176613855  2 (4500576..4502495)    1920
>>  11: [26496..26655]:  1176615040..1176615199  2 (4503680..4503839)     160
>>  12: [26656..46879]:  hole                                           20224
>>
>> But in some other WAL I see this:
>> # xfs_bmap -v /mnt/data01/kudu-tserver-wal/wals/508ecdfa8904bdb97a02078a91822af/index.000000000
>> /mnt/data01/kudu-tserver-wal/wals/508ecdfa89054bdb97a02078a91822af/index.000000000:
>> EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET            TOTAL
>>   0: [0..7]:          1758753776..1758753783  3 (586736..586743)        8
>>   1: [8..46879]:      hole                                           46872
>>
>> Looks like only 8 blocks are actually in use there, and all the other
>> blocks are a hole.
>>
>> So it looks like I can use the formulas with confidence.
>> Worst case: 8 MB/segment * 80 max segments * 2000 tablets = 1,280,000 MB
>> = ~1.3 TB (+ some minor index overhead)
>> Normal case: 8 MB/segment * 1 segment * 2000 tablets = 16,000 MB =
>> ~16 GB (+ some minor index overhead)
>>
>> Right?
>>
>> Wed, Jun 19, 2019 at 09:35, Todd Lipcon <t...@cloudera.com>:
>>
>>> Hi Pavel,
>>>
>>> That's not quite expected. For example, on one of our test clusters
>>> here, we have about 65GB of WALs and about 1GB of index files.
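For what it's worth, the estimate Pavel wrote out can be sketched as shell arithmetic. The segment size and retained-segment count are the Kudu defaults (`--log_segment_size_mb` and `--log_max_segments_to_retain`, if I remember the flag names right); 2000 is our own replica count:

```shell
# WAL capacity estimate: 8 MB segments, up to 80 retained segments per
# tablet, across 2000 tablet replicas on one server.
seg_mb=8; max_segs=80; tablets=2000
worst_mb=$(( seg_mb * max_segs * tablets ))   # 1,280,000 MB, i.e. ~1.3 TB
normal_mb=$(( seg_mb * 1 * tablets ))         # 16,000 MB, i.e. ~16 GB
echo "worst=${worst_mb}MB normal=${normal_mb}MB"
```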
>>> If I recall correctly, the index files store 8 bytes per WAL entry, so
>>> they are typically a couple of orders of magnitude smaller than the
>>> WALs themselves.
>>>
>>> One thing to note is that the index files are sparse. Any chance
>>> you've done something like copy the files away and back that might
>>> cause them to lose their sparseness? If I use du --apparent-size on
>>> mine, it's a total of about 180GB vs the 1GB of actual size.
>>>
>>> Also, any chance you're using XFS here? XFS sometimes likes to
>>> preallocate large amounts of data into files while they're open, and
>>> only frees it up if disk space is contended. I think you can use
>>> 'xfs_bmap' on an index file to see the allocation status, which might
>>> be interesting.
>>>
>>> -Todd
>>>
>>> On Tue, Jun 18, 2019 at 11:12 PM Pavel Martynov <mr.xk...@gmail.com>
>>> wrote:
>>>
>>>> Hi guys!
>>>>
>>>> We want to buy SSDs for the tablet servers' WALs in our cluster. I'm
>>>> working on a capacity estimate for these SSDs using the "Getting
>>>> Started with Kudu" book, Chapter 4, Write-Ahead Log (
>>>> https://www.oreilly.com/library/view/getting-started-with/9781491980248/ch04.html
>>>> ).
>>>>
>>>> NB: we use the default Kudu WAL configuration settings.
>>>>
>>>> There is a formula for the worst case:
>>>> 8 MB/segment * 80 max segments * 2000 tablets = 1,280,000 MB = ~1.3 TB
>>>>
>>>> So, this formula takes into account only segment files. But in our
>>>> cluster I see that every segment file has >= 1 corresponding index
>>>> files, and every index file is actually larger than its segment file.
>>>>
>>>> Numbers from one of our nodes.
>>>> WALs count:
>>>> $ ls /mnt/data01/kudu-tserver-wal/wals/ | wc -l
>>>> 711
>>>>
>>>> Overall WAL size:
>>>> $ du -d 0 -h /mnt/data01/kudu-tserver-wal/
>>>> 13G     /mnt/data01/kudu-tserver-wal/
>>>>
>>>> Size of all segment files:
>>>> $ find /mnt/data01/kudu-tserver-wal/ -type f -name 'wal-*' -exec du -ch {} + | grep total$
>>>> 6.1G    total
>>>>
>>>> Size of all index files:
>>>> $ find /mnt/data01/kudu-tserver-wal/ -type f -name 'index*' -exec du -ch {} + | grep total$
>>>> 6.5G    total
>>>>
>>>> So I have some questions.
>>>>
>>>> 1. How can I estimate the size of the index files?
>>>> It looks like in our cluster the total size of the index files is
>>>> approximately equal to the total size of the segment files.
>>>>
>>>> 2. There are some WALs with more than one index file. For example:
>>>> $ ls -lh /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f/
>>>> total 296M
>>>> -rw-r--r-- 1 root root  23M Jun 18 21:31 index.000000108
>>>> -rw-r--r-- 1 root root  23M Jun 18 21:41 index.000000109
>>>> -rw-r--r-- 1 root root  23M Jun 18 21:52 index.000000110
>>>> -rw-r--r-- 1 root root  23M Jun 18 22:10 index.000000111
>>>> -rw-r--r-- 1 root root  23M Jun 18 22:22 index.000000112
>>>> -rw-r--r-- 1 root root  23M Jun 18 22:35 index.000000113
>>>> -rw-r--r-- 1 root root  23M Jun 18 22:48 index.000000114
>>>> -rw-r--r-- 1 root root  23M Jun 18 23:01 index.000000115
>>>> -rw-r--r-- 1 root root  23M Jun 18 23:14 index.000000116
>>>> -rw-r--r-- 1 root root  23M Jun 18 23:27 index.000000117
>>>> -rw-r--r-- 1 root root  23M Jun 18 23:40 index.000000118
>>>> -rw-r--r-- 1 root root  23M Jun 18 23:52 index.000000119
>>>> -rw-r--r-- 1 root root  23M Jun 19 01:13 index.000000120
>>>> -rw-r--r-- 1 root root 8.0M Jun 19 01:13 wal-000007799
>>>>
>>>> Is this a normal situation?
>>>>
>>>> 3. Not a question. Please consider adding documentation about the
>>>> estimation of WAL storage.
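On question 2, a quick way to see how widespread the multiple-index-file situation is would be to scan the WAL root for dirs with more than one index file. A sketch (the path in the usage comment is just our layout):

```shell
# List WAL dirs that contain more than one index file.
multi_index_dirs() {
  for d in "$1"/*/; do
    n=$(find "$d" -maxdepth 1 -name 'index.*' | wc -l)
    if [ "$n" -gt 1 ]; then
      printf '%s index files: %s\n' "$n" "$d"
    fi
  done
}

# Usage: multi_index_dirs /mnt/data01/kudu-tserver-wal/wals
```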
>>>> Also, I can't find any mention of index files, except here:
>>>> https://kudu.apache.org/docs/scaling_guide.html#file_descriptors
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> with best regards, Pavel Martynov
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
>> --
>> with best regards, Pavel Martynov
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
with best regards, Pavel Martynov