Re: \o/ compsize
On Mon, Sep 04, 2017 at 08:42:29PM +0200, Adam Borowski wrote:
> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
> > 2017-09-04 18:11 GMT+03:00 Adam Borowski :
> > > Here's a utility to measure used compression type + ratio on a set of
> > > files or directories: https://github.com/kilobyte/compsize
> > >
> > > It should be of great help for users, and also if you:
> > > * muck with compression levels
> > > * add new compression types
> > > * add heuristics that could err on withholding compression too much
> >
> > Packaged to AUR:
> > https://aur.archlinux.org/packages/compsize-git/
>
> Cool! I'd wait until people say the code is sane (I don't really know these
> ioctls) but if you want to make poor AUR folks our beta testers, that's ok.
>
> However, one issue: I did not set a license; your packaging says GPL3.
> It would be better to have something compatible with btrfs-progs which are
> GPL2-only. What about GPL2-or-higher?
>
> After adding some related info (like wasted space in pinned extents, reuse
> of extents), it'd be nice to have this tool inside btrfs-progs, either as a
> part of "fi du" or another command.

I've now implemented a prototype that calculates the compressed size of
extents per file. As 'fi du' knows which extents are shared, the compressed
size can also be split into shared and exclusive.

There's no summary like the one compsize prints; that would need somewhat
more precise tracking of the extents: not just the compressed size but also
which algorithm was used. I can imagine all sorts of output enhancements,
like summarizing the inline-compressed extents or printing the algorithm
summary per file. This should be easy once the calculation code is there.

I haven't reused compsize.c, as I needed only the ioctl part, wired into
'fi du', but the search ioctl is the same.
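To make the shared/exclusive idea concrete, here is a rough sketch of the accounting only. It is not the prototype's code: the `Extent` record, its `shared` flag and the `per_file_compressed` helper are all hypothetical, and it assumes something else (such as 'fi du') has already decided which extents are shared.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    compressed_bytes: int  # on-disk size of the extent
    shared: bool           # assumed to be supplied by the caller


def per_file_compressed(files):
    """Split each file's compressed size into shared and exclusive parts."""
    result = {}
    for name, extents in files.items():
        shared = sum(e.compressed_bytes for e in extents if e.shared)
        exclusive = sum(e.compressed_bytes for e in extents if not e.shared)
        result[name] = {
            "total": shared + exclusive,
            "exclusive": exclusive,
            "shared": shared,
        }
    return result


# Made-up input: one file with a shared and an exclusive extent.
files = {"a.txt": [Extent(4096, True), Extent(8192, False)]}
report = per_file_compressed(files)
print(report["a.txt"])
```

A per-algorithm summary would need one more field per extent (the compression type), which matches the "more precise tracking" mentioned in the message.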
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: \o/ compsize
On Mon, Sep 04, 2017 at 10:33:40PM +0200, A L wrote:
> On 9/4/2017 5:11 PM, Adam Borowski wrote:
> > Hi!
> > Here's a utility to measure used compression type + ratio on a set of files
> > or directories: https://github.com/kilobyte/compsize
>
> Great tool. Just tried it on some of my backup snapshots.
>
> # compsize portage.20170904T2200
> 142432 files.
> all   78%  329M/ 422M
> none 100%  227M/ 227M
> zlib  52%  102M/ 195M
>
> # du -sh portage.20170904T2200
> 787M    portage.20170904T2200
>
> # btrfs fi du -s portage.20170904T2200
>      Total   Exclusive  Set shared  Filename
>  271.61MiB     6.34MiB   245.51MiB  portage.20170904T2200
>
> Interesting results. How do I interpret them?

I've added some documentation, especially in the man page. (Sorry for not
pushing this earlier; Timofey went wild on this tool and I wanted to avoid
conflicts.)

> Compsize also doesn't seem to like some non-standard files and throws an
> error (even though they should be ignored?):
>
> # compsize usb-backup/volumes/root/root.20170727T2321/
> open("usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350"):
> No such device or address
>
> # dir
> usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350
> srwx-- 1 root root 0 Dec 31 2015
> usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350=

Fixed.

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄
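For anyone with the same question about the compsize table, a minimal sketch of the arithmetic. It assumes each row reads "type, percentage, on-disk size / uncompressed size" and that the percentage is simply the first size divided by the second; the man page remains the authoritative description. The helper name `ratio_percent` is made up for this illustration and is not part of compsize.

```python
def ratio_percent(disk: int, uncompressed: int) -> int:
    """On-disk size as a rounded percentage of the uncompressed size."""
    if uncompressed == 0:
        return 100  # assumption: report "no data" as incompressible
    return round(100 * disk / uncompressed)


# Sizes in the units compsize printed (M), copied from the output quoted above.
rows = {"none": (227, 227), "zlib": (102, 195)}

# The "all" row is the column-wise sum of the per-algorithm rows.
total = (
    sum(disk for disk, _ in rows.values()),
    sum(uncompressed for _, uncompressed in rows.values()),
)

for name, (disk, uncompressed) in [("all", total), *rows.items()]:
    pct = ratio_percent(disk, uncompressed)
    print(f"{name:<5}{pct:>4}% {disk:>5}M/{uncompressed:>5}M")
```

Under these assumptions the sketch reproduces the 78%, 100% and 52% figures from the quoted run, and shows that the "all" row is just the sum of the "none" and "zlib" rows.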
Re: \o/ compsize
On 9/4/17, 8:12 AM, "Adam Borowski" wrote:
> Hi!
> Here's a utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize
>
> It should be of great help for users, and also if you:
> * muck with compression levels
> * add new compression types
> * add heuristics that could err on withholding compression too much

Thanks for writing this tool, Adam, I'll try it out with zstd! It looks very
useful for benchmarking compression algorithms, much better than measuring
the filesystem size with du/df.

> (Thanks to Knorrie and his python-btrfs project that made figuring out the
> ioctls much easier.)
>
> Meow!
> --
> ⢀⣴⠾⠻⢶⣦⠀
> ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
> ⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
> ⠈⠳⣄
Re: \o/ compsize
On 2017-09-05 22:21, Hans van Kranenburg wrote:
> On 09/05/2017 04:02 PM, Qu Wenruo wrote:
>> On 2017-09-05 03:52, Timofey Titovets wrote:
>>> 2017-09-04 21:42 GMT+03:00 Adam Borowski :
>>>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>>>>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>>>>>> Here's a utility to measure used compression type + ratio on a set
>>>>>> of files or directories: https://github.com/kilobyte/compsize
>>>>>>
>>>>>> It should be of great help for users, and also if you:
>>>>>> * muck with compression levels
>>>>>> * add new compression types
>>>>>> * add heuristics that could err on withholding compression too much
>>
>> Did a brief review, and the result looks quite good.
>> In particular, the same disk bytenr is handled well, so file extents
>> referring to different parts of one large extent won't get counted twice.
>>
>> Nice job.
>>
>> But some smaller improvements can still be made:
>> (Please keep in mind I can go totally wrong since I'm not doing a
>> comprehensive review)
>>
>> Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
>> which should filter out unrelated results.
>
> No, it does not.
>
> https://patchwork.kernel.org/patch/9767619/

Why not?

Min key = ino, EXTENT_DATA, 0
Max key = ino, EXTENT_DATA, -1

With that min_key and max_key, the result is just what we want.
This also filters out any item that does not belong to this ino, as well as
other things like XATTR or whatever.

Thanks,
Qu

>> And to improve readability, using the functions defined by
>> BTRFS_SETGET_STACK_FUNCS() will be a big improvement for reviewers.
>> (So I can check if the magic numbers are right or not, since I'm a lazy
>> bone and don't want to manually calculate the offset)
>>
>>>>> Packaged to AUR:
>>>>> https://aur.archlinux.org/packages/compsize-git/
>>
>> Nice, I don't even need to build it myself!
>> (Well, not many dependencies anyway)
>>
>>>> Cool! I'd wait until people say the code is sane (I don't really know
>>>> these ioctls) but if you want to make poor AUR folks our beta testers,
>>>> that's ok.
>>
>> The code is sane!
>> And it even considers inline extents! (Which I didn't consider BTW, as
>> inline extents count as metadata, not data, so my first thought was to
>> just ignore them).
>>
>>> This is just too handy =)
>>>
>>>> However, one issue: I did not set a license; your packaging says GPL3.
>>>> It would be better to have something compatible with btrfs-progs
>>>> which are GPL2-only. What about GPL2-or-higher?
>>>
>>> Sorry about the license, just a copy-paste error, fixed
>>>
>>>> After adding some related info (like wasted space in pinned extents,
>>>> reuse of extents), it'd be nice to have this tool inside btrfs-progs,
>>>> either as a part of "fi du" or another command.
>>>
>>> That will be useful =)
>>
>> If improved, I think there is a chance to get it into btrfs-progs.
>>
>> Thanks,
>> Qu
>>
>>> P.S.
>>> your code works amazingly fast on my ssd and data %)
>>> 150Gb data
>>> -O0 2.12s
>>> -O2 0.51s
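The two positions above can be compared with a small model. This is only a sketch, and it rests on one assumption: that the search ioctl treats (objectid, type, offset) as a single compound key and returns items whose key lies between min_key and max_key, as opposed to filtering each field on its own. The `Key` class, `search` helper and the sample items are hypothetical; the item type numbers are the ones from the btrfs on-disk format.

```python
from typing import NamedTuple

# Item type values from the btrfs on-disk format.
BTRFS_INODE_ITEM_KEY = 1
BTRFS_XATTR_ITEM_KEY = 24
BTRFS_EXTENT_DATA_KEY = 108
U64_MAX = 2**64 - 1  # what "-1" means for an unsigned 64-bit offset


class Key(NamedTuple):
    objectid: int
    type: int
    offset: int


# Hypothetical items belonging to two inodes, 257 and 258.
items = [
    Key(257, BTRFS_INODE_ITEM_KEY, 0),
    Key(257, BTRFS_XATTR_ITEM_KEY, 12345),
    Key(257, BTRFS_EXTENT_DATA_KEY, 0),
    Key(257, BTRFS_EXTENT_DATA_KEY, 131072),
    Key(258, BTRFS_INODE_ITEM_KEY, 0),
    Key(258, BTRFS_EXTENT_DATA_KEY, 0),
]


def search(min_key: Key, max_key: Key) -> list:
    """Tuples compare field by field, left to right: a compound-key range."""
    return [key for key in items if min_key <= key <= max_key]


# Same objectid in min and max: only that inode's EXTENT_DATA items match.
one_inode = search(
    Key(257, BTRFS_EXTENT_DATA_KEY, 0),
    Key(257, BTRFS_EXTENT_DATA_KEY, U64_MAX),
)

# Objectid range 257..258: inode 258's INODE_ITEM falls inside the range,
# so setting min_type/max_type alone does not act as a type filter.
two_inodes = search(
    Key(257, BTRFS_EXTENT_DATA_KEY, 0),
    Key(258, BTRFS_EXTENT_DATA_KEY, U64_MAX),
)

print(one_inode)
print(two_inodes)
```

If that assumption holds, both statements are consistent: min_type/max_type are not an independent filter, but pinning the objectid to a single inode makes the range cover exactly that inode's EXTENT_DATA items.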
Re: \o/ compsize
On 09/05/2017 04:02 PM, Qu Wenruo wrote:
> On 2017-09-05 03:52, Timofey Titovets wrote:
>> 2017-09-04 21:42 GMT+03:00 Adam Borowski :
>>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>>>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>>>>> Here's a utility to measure used compression type + ratio on a set
>>>>> of files or directories: https://github.com/kilobyte/compsize
>>>>>
>>>>> It should be of great help for users, and also if you:
>>>>> * muck with compression levels
>>>>> * add new compression types
>>>>> * add heuristics that could err on withholding compression too much
>
> Did a brief review, and the result looks quite good.
> In particular, the same disk bytenr is handled well, so file extents
> referring to different parts of one large extent won't get counted twice.
>
> Nice job.
>
> But some smaller improvements can still be made:
> (Please keep in mind I can go totally wrong since I'm not doing a
> comprehensive review)
>
> Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
> which should filter out unrelated results.

No, it does not.

https://patchwork.kernel.org/patch/9767619/

> And to improve readability, using the functions defined by
> BTRFS_SETGET_STACK_FUNCS() will be a big improvement for reviewers.
> (So I can check if the magic numbers are right or not, since I'm a lazy
> bone and don't want to manually calculate the offset)
>
>>>> Packaged to AUR:
>>>> https://aur.archlinux.org/packages/compsize-git/
>
> Nice, I don't even need to build it myself!
> (Well, not many dependencies anyway)
>
>>> Cool! I'd wait until people say the code is sane (I don't really
>>> know these ioctls) but if you want to make poor AUR folks our beta
>>> testers, that's ok.
>
> The code is sane!
> And it even considers inline extents! (Which I didn't consider BTW, as
> inline extents count as metadata, not data, so my first thought was to
> just ignore them).
>
>> This is just too handy =)
>>
>>> However, one issue: I did not set a license; your packaging says GPL3.
>>> It would be better to have something compatible with btrfs-progs
>>> which are GPL2-only. What about GPL2-or-higher?
>>
>> Sorry about the license, just a copy-paste error, fixed
>>
>>> After adding some related info (like wasted space in pinned extents,
>>> reuse of extents), it'd be nice to have this tool inside btrfs-progs,
>>> either as a part of "fi du" or another command.
>>
>> That will be useful =)
>
> If improved, I think there is a chance to get it into btrfs-progs.
>
> Thanks,
> Qu
>
>> P.S.
>> your code works amazingly fast on my ssd and data %)
>> 150Gb data
>> -O0 2.12s
>> -O2 0.51s

--
Hans van Kranenburg
Re: \o/ compsize
On 2017-09-05 03:52, Timofey Titovets wrote:
> 2017-09-04 21:42 GMT+03:00 Adam Borowski :
>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>>>> Here's a utility to measure used compression type + ratio on a set
>>>> of files or directories: https://github.com/kilobyte/compsize
>>>>
>>>> It should be of great help for users, and also if you:
>>>> * muck with compression levels
>>>> * add new compression types
>>>> * add heuristics that could err on withholding compression too much

Did a brief review, and the result looks quite good.
In particular, the same disk bytenr is handled well, so file extents
referring to different parts of one large extent won't get counted twice.

Nice job.

But some smaller improvements can still be made:
(Please keep in mind I can go totally wrong since I'm not doing a
comprehensive review)

Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY,
which should filter out unrelated results.

And to improve readability, using the functions defined by
BTRFS_SETGET_STACK_FUNCS() will be a big improvement for reviewers.
(So I can check if the magic numbers are right or not, since I'm a lazy
bone and don't want to manually calculate the offset)

>>> Packaged to AUR:
>>> https://aur.archlinux.org/packages/compsize-git/

Nice, I don't even need to build it myself!
(Well, not many dependencies anyway)

>> Cool! I'd wait until people say the code is sane (I don't really know
>> these ioctls) but if you want to make poor AUR folks our beta testers,
>> that's ok.

The code is sane!
And it even considers inline extents! (Which I didn't consider BTW, as
inline extents count as metadata, not data, so my first thought was to
just ignore them).

> This is just too handy =)
>
>> However, one issue: I did not set a license; your packaging says GPL3.
>> It would be better to have something compatible with btrfs-progs which
>> are GPL2-only. What about GPL2-or-higher?
>
> Sorry about the license, just a copy-paste error, fixed
>
>> After adding some related info (like wasted space in pinned extents,
>> reuse of extents), it'd be nice to have this tool inside btrfs-progs,
>> either as a part of "fi du" or another command.
>
> That will be useful =)

If improved, I think there is a chance to get it into btrfs-progs.

Thanks,
Qu

> P.S.
> your code works amazingly fast on my ssd and data %)
> 150Gb data
> -O0 2.12s
> -O2 0.51s
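The "same disk bytenr" point from the review can be illustrated with a short sketch. It is not taken from compsize.c: the `FileExtent` record and the `totals` helper are hypothetical, though the field names follow the on-disk file extent item. The assumption is that an on-disk extent is identified by its start address, so several file extents that reference different parts of it should contribute its size only once.

```python
from typing import NamedTuple


class FileExtent(NamedTuple):
    file_offset: int     # where this reference starts in the file
    disk_bytenr: int     # start of the extent on disk
    disk_num_bytes: int  # size of the extent on disk (compressed)
    ram_bytes: int       # size of the extent once decompressed
    num_bytes: int       # how much of the extent this reference uses


def totals(refs):
    """Sum on-disk and uncompressed sizes, counting each disk extent once."""
    seen = set()
    disk = uncompressed = 0
    for ref in refs:
        if ref.disk_bytenr in seen:
            continue  # another reference into an extent already counted
        seen.add(ref.disk_bytenr)
        disk += ref.disk_num_bytes
        uncompressed += ref.ram_bytes
    return disk, uncompressed


# Made-up data: two references into one 128 KiB compressed extent,
# plus one small uncompressed extent.
refs = [
    FileExtent(0, 1 << 30, 40960, 131072, 65536),
    FileExtent(65536, 1 << 30, 40960, 131072, 65536),
    FileExtent(131072, 2 << 30, 4096, 4096, 4096),
]
disk, uncompressed = totals(refs)
print(disk, uncompressed)
```

Without the `seen` set, the first extent's 40960 bytes would be added twice and the reported ratio would be skewed.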
Re: \o/ compsize
On 9/4/2017 5:11 PM, Adam Borowski wrote:
> Hi!
> Here's a utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize

Great tool. Just tried it on some of my backup snapshots.

# compsize portage.20170904T2200
142432 files.
all   78%  329M/ 422M
none 100%  227M/ 227M
zlib  52%  102M/ 195M

# du -sh portage.20170904T2200
787M    portage.20170904T2200

# btrfs fi du -s portage.20170904T2200
     Total   Exclusive  Set shared  Filename
 271.61MiB     6.34MiB   245.51MiB  portage.20170904T2200

Interesting results. How do I interpret them?

Compsize also doesn't seem to like some non-standard files and throws an
error (even though they should be ignored?):

# compsize usb-backup/volumes/root/root.20170727T2321/
open("usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350"):
No such device or address

# dir
usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350
srwx-- 1 root root 0 Dec 31 2015
usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350=
Re: \o/ compsize
2017-09-04 21:42 GMT+03:00 Adam Borowski :
> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
>> > Here's a utility to measure used compression type + ratio on a set of
>> > files or directories: https://github.com/kilobyte/compsize
>> >
>> > It should be of great help for users, and also if you:
>> > * muck with compression levels
>> > * add new compression types
>> > * add heuristics that could err on withholding compression too much
>>
>> Packaged to AUR:
>> https://aur.archlinux.org/packages/compsize-git/
>
> Cool! I'd wait until people say the code is sane (I don't really know these
> ioctls) but if you want to make poor AUR folks our beta testers, that's ok.

This is just too handy =)

> However, one issue: I did not set a license; your packaging says GPL3.
> It would be better to have something compatible with btrfs-progs which are
> GPL2-only. What about GPL2-or-higher?

Sorry about the license, just a copy-paste error, fixed

> After adding some related info (like wasted space in pinned extents, reuse
> of extents), it'd be nice to have this tool inside btrfs-progs, either as a
> part of "fi du" or another command.

That will be useful =)

P.S.
your code works amazingly fast on my ssd and data %)
150Gb data
-O0 2.12s
-O2 0.51s

--
Have a nice day,
Timofey.
Re: \o/ compsize
On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
> 2017-09-04 18:11 GMT+03:00 Adam Borowski :
> > Here's a utility to measure used compression type + ratio on a set of files
> > or directories: https://github.com/kilobyte/compsize
> >
> > It should be of great help for users, and also if you:
> > * muck with compression levels
> > * add new compression types
> > * add heuristics that could err on withholding compression too much
>
> Packaged to AUR:
> https://aur.archlinux.org/packages/compsize-git/

Cool! I'd wait until people say the code is sane (I don't really know these
ioctls) but if you want to make poor AUR folks our beta testers, that's ok.

However, one issue: I did not set a license; your packaging says GPL3.
It would be better to have something compatible with btrfs-progs which are
GPL2-only. What about GPL2-or-higher?

After adding some related info (like wasted space in pinned extents, reuse
of extents), it'd be nice to have this tool inside btrfs-progs, either as a
part of "fi du" or another command.

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄
Re: \o/ compsize
2017-09-04 18:11 GMT+03:00 Adam Borowski :
> Hi!
> Here's a utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize
>
> It should be of great help for users, and also if you:
> * muck with compression levels
> * add new compression types
> * add heuristics that could err on withholding compression too much
>
> (Thanks to Knorrie and his python-btrfs project that made figuring out the
> ioctls much easier.)
>
> Meow!
> --
> ⢀⣴⠾⠻⢶⣦⠀
> ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
> ⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
> ⠈⠳⣄

Packaged to AUR:
https://aur.archlinux.org/packages/compsize-git/

--
Have a nice day,
Timofey.
\o/ compsize
Hi!
Here's a utility to measure used compression type + ratio on a set of files
or directories: https://github.com/kilobyte/compsize

It should be of great help for users, and also if you:
* muck with compression levels
* add new compression types
* add heuristics that could err on withholding compression too much

(Thanks to Knorrie and his python-btrfs project that made figuring out the
ioctls much easier.)

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄