On Sun, 12 Sep 2021 at 09:00, Jeff Pohlmeyer <yetanotherg...@gmail.com> wrote:
>
> On Fri, Sep 10, 2021 at 7:52 PM Denys Vlasenko <vda.li...@googlemail.com>
> wrote:
> > I'm getting this:
>
> > (add/remove: 96/0 grow/shrink: 6/2 up/down: 24743/-98) Total: 24645
> > bytes

I can bring this down a bit by declaring all functions static; inlining
and constant propagation do the rest.

Using git/busybox as the source for busybox (x86_64, gcc 10):

  GEN     /tmp/build/Makefile
function                                        old     new   delta
unpack_zstd_stream                                -    5070   +5070
static.HUF_readDTableX1_wksp_bmi2                 -    1755   +1755
static.ZSTD_decompressBlock_internal              -    1468   +1468
static.ZSTD_decompressSequences_body              -    1429   +1429
ZSTD_decompressContinue                           -    1062   +1062
HUF_decompress4X1_usingDTable_internal_body       -     883    +883
FSE_readNCount_body                               -     622    +622
ML_defaultDTable                                  -     520    +520
LL_defaultDTable                                  -     520    +520
static.ZSTD_buildFSETable_body                    -     518    +518
XXH64_digest                                      -     494    +494
static.FSE_decompress_usingDTable_generic         -     470    +470
ZSTD_getFrameHeader_advanced                      -     423    +423
static.XXH64_update_endian                        -     416    +416
ZSTD_decompressBegin_usingDDict                   -     391    +391
static.ZSTD_buildSeqTable                         -     375    +375
ZSTD_execSequenceEnd                              -     300    +300
OF_defaultDTable                                  -     264    +264
ZSTD_decodeFrameHeader                            -     259    +259
BIT_initDStream                                   -     258    +258
ZSTD_DCtx_selectFrameDDict                        -     234    +234
ZSTD_safecopy                                     -     225    +225
ML_bits                                           -     212    +212
ML_base                                           -     212    +212
.rodata                                       98830   99029    +199
ZSTD_decompressContinueStream                     -     177    +177
LL_bits                                           -     144    +144
LL_base                                           -     144    +144
OF_bits                                           -     128    +128
OF_base                                           -     128    +128
unzstd_main                                       -     126    +126
static.HUF_decodeStreamX1                         -     117    +117
BIT_reloadDStream                                 -     114    +114
ZSTD_overlapCopy8                                 -     107    +107
ZSTD_clearDict                                    -     105    +105
ZSTD_frameHeaderSize_internal                     -     103    +103
HUF_decompress1X1_usingDTable_internal_body       -     102    +102
ZSTD_wildcopy                                     -      94     +94
static.unzstd_longopts                            -      81     +81
packed_usage                                  34120   34198     +78
ZSTD_getcBlockSize                                -      78     +78
tar_main                                       1290    1360     +70
FSE_decodeSymbolFast                              -      58     +58
BIT_reloadDStreamFast                             -      50     +50
setup_transformer_on_fd                         155     204     +49
FSE_decodeSymbol                                  -      44     +44
HUF_decodeSymbolX1                                -      39     +39
BIT_readBits                                      -      38     +38
ZSTD_initFseState                                 -      34     +34
static.dec64table                                 -      32     +32
static.dec32table                                 -      32     +32
ZSTD_fcs_fieldSize                                -      32     +32
ZSTD_did_fieldSize                                -      32     +32
static.ZSTD_customFree                            -      27     +27
applet_main                                    3192    3216     +24
BIT_endOfDStream                                  -      22     +22
applet_names                                   2747    2767     +20
repStartValue                                     -      12     +12
tar_longopts                                    314     321      +7
static.CSWTCH                                     -       6      +6
applet_suid                                     100     101      +1
applet_install_loc                              200     201      +1
------------------------------------------------------------------------------
(add/remove: 54/0 grow/shrink: 9/0 up/down: 21035/0)        Total: 21035 bytes
   text    data     bss     dec     hex filename
 999282   16443    1856 1017581   f86ed busybox_old
1020376   16467    1856 1038699   fd96b busybox_unstripped

>
> > I suspect Facebook et al do not share busybox's zeal about smaller size.

Speed in particular is one of zstd's selling points, so size is a bit
beside the point there ;) Ideally we could define some macros to get
there. I believe the simplest explanation is that no one cared enough
to cleanly separate every option.

>
> I found this comment on github[1]:
> "There is no new magic number planned in the foreseeable future.
> 0xFD2FB528 is intended to be the only magic number for zstd frames."
>
> Do you think that implies that at least the basic file format is
> probably stable?

The format is documented and even published as RFC 8878. Digging
through the code I already found some spots that add code to ensure no
data is produced that old (reference) implementations can't decode
(i.e. workarounds for bugs), so going with the reference implementation
should be rather safe. Still, I think that being able to track upstream
is the best path.

I did my own patch some time ago (it just took a while to clean it up).
As far as I can see it has some bits that are missing in Jeff's patch:
the unzstd applet is a bit more featureful and behaves like the
reference.

The concept for the upstream sources would be to use tools/scripts for
most changes (documented in README.source as well). Extending that to,
say, cut out comments or unused functions (anything related to
compression/dictionaries) should make upstream syncs simpler and drop
about two thirds of the lines.

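To illustrate what the "declare everything static" step does to a
prototype, here is a minimal sketch that runs two substitutions of the
same shape as the sed call further down against a made-up header. The
file name sample.h and the three prototypes are hypothetical and not
taken from zstd; GNU sed is assumed, since both \? and -i without a
suffix are GNU extensions.

```shell
#!/bin/sh
# Hypothetical sample header; these prototypes are made up for illustration.
tmp=$(mktemp -d)
cat > "$tmp/sample.h" <<'EOF'
size_t ZSTD_example(void* dst);
unsigned HUF_example (const void* src);
int unrelated_function(void);
EOF

# Prefix ZSTD_/HUF_ prototypes with "static" and drop an optional space
# before the opening parenthesis (GNU sed assumed for \? and -i).
sed -e 's,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* HUF_[[:alnum:]_]*\) \?(,static \1(,' \
    -i "$tmp/sample.h"

cat "$tmp/sample.h"
```

The two prefixed prototypes come out with a leading "static", the
unrelated one is left alone. The patterns only match when the return
type is a single token directly before the function name, so a
multi-word return type such as "unsigned long long" would slip through.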
$zstd_path/contrib/freestanding_lib/freestanding.py \
    --source-lib $zstd_path/lib \
    --output-lib zstd \
    -DZSTD_NO_INTRINSICS \
    -DZSTD_NO_UNUSED_FUNCTIONS \
    -DZSTD_LEGACY_SUPPORT=0 \
    -DZSTD_STATIC_LINKING_ONLY \
    -DFSE_STATIC_LINKING_ONLY \
    -DHUF_STATIC_LINKING_ONLY \
    -DXXH_STATIC_LINKING_ONLY \
    -DZSTD_ADDRESS_SANITIZER=0 \
    -DZSTD_MEMORY_SANITIZER=0 \
    -UFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION \
    -U__cplusplus \
    -UZSTD_DLL_EXPORT \
    -UZSTD_DLL_IMPORT \
    -UZSTD_MULTITHREAD \
    -RZSTDLIB_API=MEM_STATIC \
    -RZSTDLIB_VISIBILITY=MEM_STATIC \
    -RZSTDERRORLIB_VISIBILITY=MEM_STATIC \
    -DZSTD_HAVE_WEAK_SYMBOLS=0 \
    -DZSTD_TRACE=0 \
    -DZSTD_NO_TRACE

sed -e 's,^\([[:alnum:]_\*]* ERR_[[:alnum:]_]*\)(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* FSE_[[:alnum:]_]*\) \?(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* HUF_[[:alnum:]_]*\) \?(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* HIST_[[:alnum:]_]*\)(,static \1(,' \
    -e 's,^\(const \)\?\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1\2(,' \
    -i zstd/*/*.h

Norbert
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox