Branch: refs/heads/yves/refactored_and_improved_mph_code
Home:   https://github.com/Perl/perl5
Commit: 34ae082d4674816a90dc8d8566dbfb66642d3a4b
    https://github.com/Perl/perl5/commit/34ae082d4674816a90dc8d8566dbfb66642d3a4b
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M t/op/magic.t

Log Message:
-----------
op/magic.t - make $SIG{ALRM} local test deal with inherited IGNORE'd signal

Signal ignores can be set up by a parent process and the child process will inherit them. So in some situations, when we test, $SIG{ALRM} might be "IGNORE" and not undef.

So instead of just expecting undef, check what it was before the localization, and then check it is still that value afterwards.

I found this while running make test via rebase --exec, something like this reduction (from Matthew Horsfall):

    $ git rebase --exec='perl -le"print \$SIG{ALRM}"' HEAD~1
    IGNORE

A similar case for a different signal would be this example (from Leon Timmermans):

    $ nohup perl -E 'say $SIG{HUP}' 2>/dev/null | cat
    IGNORE

Commit: b645c3905b03d9b23f1505d8d9e28507abbab521
    https://github.com/Perl/perl5/commit/b645c3905b03d9b23f1505d8d9e28507abbab521
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - remove unnecessary use of bignum

We can detect we are on a 32 bit perl and then do the right thing. This *massively* speeds up the process. Hashing character by character with bignum enabled is painfully slow.

Commit: 326fc1fe71030f5c78393bdb3b28d91767229155
    https://github.com/Perl/perl5/commit/326fc1fe71030f5c78393bdb3b28d91767229155
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - document build_split_words()

Explain the behavior and inputs and outputs of build_split_words()

Commit: fe602516dcadefe6e00768a2e34fd24aecb6d644
    https://github.com/Perl/perl5/commit/fe602516dcadefe6e00768a2e34fd24aecb6d644
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - refactor common sort routine out

This sort expression gets repeated a lot and it is quite long, replace with a sub.
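
The op/magic.t change above compares the handler seen after localization with the one seen before it, rather than with undef. A minimal sketch of that idea follows; the helper names sig_around_local and same_or_both_undef are hypothetical and this is not the code from t/op/magic.t:

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with $SIG{$name} localized to $handler and
# report the handler value seen before and after the localized scope.
sub sig_around_local {
    my ($name, $handler, $code) = @_;
    my $before = $SIG{$name};    # usually undef, "IGNORE" if inherited
    {
        local $SIG{$name} = $handler;
        $code->();
    }
    return ($before, $SIG{$name});
}

# Two possibly-undef values match when both are undef or both are the
# same string.
sub same_or_both_undef {
    my ($x, $y) = @_;
    return 1 if !defined $x && !defined $y;
    return 0 if !defined $x || !defined $y;
    return $x eq $y ? 1 : 0;
}

my ($before, $after) = sig_around_local('ALRM', sub { }, sub { });
print same_or_both_undef($before, $after) ? "ok\n" : "not ok\n";
```

Because the expectation is captured at run time, the same check should hold whether the process starts with a default ALRM disposition or an inherited IGNORE.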

Commit: 13cf268b64b9a2eb783e7a3415f661efb3987260
    https://github.com/Perl/perl5/commit/13cf268b64b9a2eb783e7a3415f661efb3987260
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - remove unused arguments from build_split_words()

These were vestigial from a previous implementation, no point in leaving them in, and they impact debugging.

As part of this it makes sense to rename %res to $res in build_scalar_words() as it makes it easier to make $old_res= $res when we retry.

Commit: ded75ec1da80b1ce0281f04e4367fd5744624142
    https://github.com/Perl/perl5/commit/ded75ec1da80b1ce0281f04e4367fd5744624142
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - dont use $key when dealing with $part of a key

In build_split_words() $key is used for the main hash, use $part instead when we are dealing with part of a key (even if the $part might equal the $key).

Commit: e4a66cbd997691729a9fdd0f79d9d9a31449c505
    https://github.com/Perl/perl5/commit/e4a66cbd997691729a9fdd0f79d9d9a31449c505
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - dont null terminate in preprocess stage

Removing the null termination allows us to save about 100 bytes from the final result.

Commit: b82389bc5ed10a64ace577eacd610326e9dffbee
    https://github.com/Perl/perl5/commit/b82389bc5ed10a64ace577eacd610326e9dffbee
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - track added bytes while we process/check the blob

It is possible that the $blob grows while we validate it, if so show how much it grew by and produce some diagnostics about it. We also track how many passes we have done.

With this commit it is only used to make the new diagnostics a bit cleaner, but we will use it in more diagnostics later.

Commit: 0f249cbd839eee5ff287b976b6efc243b3468927
    https://github.com/Perl/perl5/commit/0f249cbd839eee5ff287b976b6efc243b3468927
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - do a bit less work in build_split_words()

We do not need to check the split point of 0 or of the length of the string as we do not want empty prefixes (the 0 case) and we have already checked if the entire string is in the blob already in an earlier check (the length case).

This also initializes the $best_prefix to be the full $key, and the $best_suffix to be the empty string just in case we cannot find any split point which would result in the variables being initialized. This prevents uninitialized warnings when we track these variables in the %appended hash.

Commit: fc226c94944ee1e13f77d662ab630f1ca59c5d33
    https://github.com/Perl/perl5/commit/fc226c94944ee1e13f77d662ab630f1ca59c5d33
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - fix assignment to be consistent with other assignments.

This file uses left hugging assignment operators, this was an exception.

Commit: a4a147272f2eebb2a5ddc0c0f2978536e0fdf5bc
    https://github.com/Perl/perl5/commit/a4a147272f2eebb2a5ddc0c0f2978536e0fdf5bc
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - better diags in build_split_words()

This also shows how effective the split process compression has been by computing what percentage the final blob is compared to the worst case of naively concatenating all the keys together.
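
The two build_split_words() commits above describe which split points are tried and how the compression percentage is computed. A simplified sketch of both follows. The sub names find_split and blob_percent are hypothetical, and the search here only accepts a split whose prefix and suffix are both already in the blob, which is an assumption and likely simpler than what regen/mph.pl actually does:

```perl
use strict;
use warnings;

# Hypothetical, simplified split search. Split points 0 and length($key)
# are skipped: 0 would give an empty prefix, and the full-length case is
# covered by the whole-key check done first.
sub find_split {
    my ($blob, $key) = @_;
    return ($key, "") if index($blob, $key) >= 0;    # whole key already present

    # Defaults in case no split point works: the whole key is the prefix.
    my ($best_prefix, $best_suffix) = ($key, "");
    for my $pos (1 .. length($key) - 1) {
        my $prefix = substr($key, 0, $pos);
        my $suffix = substr($key, $pos);
        if (index($blob, $prefix) >= 0 && index($blob, $suffix) >= 0) {
            ($best_prefix, $best_suffix) = ($prefix, $suffix);
            last;
        }
    }
    return ($best_prefix, $best_suffix);
}

# Blob size as a percentage of naively concatenating all the keys.
sub blob_percent {
    my ($blob, @keys) = @_;
    my $length_all_keys = 0;
    $length_all_keys += length($_) for @keys;
    return 100 * length($blob) / $length_all_keys;
}

my $blob = "alphabetic";
my @keys = ("alpha", "bet", "alphatic", "zeta");
for my $key (@keys) {
    my ($prefix, $suffix) = find_split($blob, $key);
    $blob .= $prefix if index($blob, $prefix) < 0;    # append only what is missing
    printf "%-8s => '%s' + '%s'\n", $key, $prefix, $suffix;
}
printf "blob is %.2f%% of the naive concatenation\n", blob_percent($blob, @keys);
```

Defaulting the prefix to the full key and the suffix to the empty string means the caller always gets defined values back, which is the property the commit relies on to avoid uninitialized warnings.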

Commit: ddcbb84d060e952525bba7d5dd23dca83e3f33e1
    https://github.com/Perl/perl5/commit/ddcbb84d060e952525bba7d5dd23dca83e3f33e1
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - rename var $b2 to $new_blob in build_split_words()

$new_blob is more descriptive and less confusing.

Commit: 4b216ad2c6a43c55bbfac0ed37269af26d3895e8
    https://github.com/Perl/perl5/commit/4b216ad2c6a43c55bbfac0ed37269af26d3895e8
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - calculate $length_all_keys in make_mph_from_hash()

Doing it in build_perfect_hash() just results in duplicated logic between build_split_words() and build_perfect_hash()

Commit: 584e6129c31235d65e83a4221ed0fb456632782c
    https://github.com/Perl/perl5/commit/584e6129c31235d65e83a4221ed0fb456632782c
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - handle split data in make_mph_from_hash()

Doing it in build_perfect_hash() does not make sense, we might want to use that function to build a hash that does not need to be split.

Commit: 4397d13a25e3a25ecc163e7a8ab8052054ab62a5
    https://github.com/Perl/perl5/commit/4397d13a25e3a25ecc163e7a8ab8052054ab62a5
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M charclass_invlists.h
  M lib/unicore/uni_keywords.pl
  M regen/mk_invlists.pl
  M uni_keywords.h

Log Message:
-----------
regen/mk_invlists.pl - move require to top of file

mk_invlists.pl does a lot and takes a while before it gets to the part where it requires regen/mph.pl, which means that if there are issues in it they aren't discovered until a fair amount of time elapses, which is frustrating when debugging. Moving the require to the top means the script dies early and can be fixed.

Includes a regen of uni_keywords.h and friends as this changes a regen script which causes regen.t to fail if its output is not up to date.

Commit: afa6102029038b2a972a9ed35251c04d78a49ea8
    https://github.com/Perl/perl5/commit/afa6102029038b2a972a9ed35251c04d78a49ea8
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M charclass_invlists.h
  M lib/unicore/uni_keywords.pl
  M regen/mk_invlists.pl
  M uni_keywords.h

Log Message:
-----------
regen/mk_invlists.pl - move token_name() sub closer to where it is used

sub token_name() was injected into the middle of totally unrelated logic that does not use it. token_name() is a wrapper around sanitize_name() so move it next to that sub.

Also includes the output from running regen/mk_invlists.pl to keep porting/regen.t happy.

Commit: 2c7654a7a3c2e57de403fbd3e7aa4045e29dbef6
    https://github.com/Perl/perl5/commit/2c7654a7a3c2e57de403fbd3e7aa4045e29dbef6
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M charclass_invlists.h
  M lib/unicore/uni_keywords.pl
  M regen/mk_invlists.pl
  M uni_keywords.h

Log Message:
-----------
regen/mk_invlists.pl - add a way to dump the keywords hash for review

This adds a way to tell mk_invlists.pl to dump the keywords hash so it can be reviewed, or used for testing or whatnot. A user can define the env var DUMP_KEYWORDS_FILE to be a file name which will be used to save the keywords hash to. If the env var is not set the file won't get written to disk.

Includes regenerated output from running regen/mk_invlists.pl to keep porting/regen.t happy.
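
An optional dump controlled by DUMP_KEYWORDS_FILE could look roughly like the sketch below. Only the environment variable name comes from the log message; the helper name dump_keywords_if_requested, the use of Data::Dumper, and the output layout are assumptions made for illustration:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: write the hash to the file named by
# DUMP_KEYWORDS_FILE, or do nothing if the variable is unset or empty.
# Returns the file name written to, or undef.
sub dump_keywords_if_requested {
    my ($keywords) = @_;
    my $file = $ENV{DUMP_KEYWORDS_FILE};
    return undef unless defined $file && length $file;

    local $Data::Dumper::Sortkeys = 1;    # stable output is easier to diff
    local $Data::Dumper::Indent   = 1;
    open my $fh, ">", $file or die "Failed to open '$file' for writing: $!";
    print {$fh} Data::Dumper->Dump([$keywords], ['keywords']);
    close $fh or die "Failed to close '$file': $!";
    return $file;
}

# Toy data; the real keywords hash is built by regen/mk_invlists.pl.
my %keywords = (alpha => 1, digit => 2);
my $written = dump_keywords_if_requested(\%keywords);
print defined $written ? "wrote $written\n" : "DUMP_KEYWORDS_FILE not set\n";
```

With the real script, the variable would be set on the command line before running regen/mk_invlists.pl; when it is absent nothing is written.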

Commit: 59eaa7f0d2ab45bacf6ca574657c566888add11a
    https://github.com/Perl/perl5/commit/59eaa7f0d2ab45bacf6ca574657c566888add11a
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - replace hard coded literal constants

This replaces the various hard coded constants with either named constants, or vars. It also renames FNV_CONST to FNV32_PRIME in the generated code as that is its proper name.

Commit: 47ea2eab61b932a203d99f2ade4f22962c30b330
    https://github.com/Perl/perl5/commit/47ea2eab61b932a203d99f2ade4f22962c30b330
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - specify which fnv we are using

This documents and specifies which FNV we are using. Since we are using fnv1a_32() it also replaces the use of $MASK with U32_MAX. fnv1a_32() would never use anything other than U32_MAX, even if $MASK changed, as obviously it is a 32 bit hash function.

Commit: be3928ec7c14f800d9e1d61010403cfecb49d01d
    https://github.com/Perl/perl5/commit/be3928ec7c14f800d9e1d61010403cfecb49d01d
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - remove $max_h check in build_perfect_hash()

Not sure why I put this in to the original code, it is a check that would be relevant to a PRNG but is not relevant to this use case. So this patch removes it.

Commit: 782ef7ac52367f8d3b474a960eaec960b5aa1bf0
    https://github.com/Perl/perl5/commit/782ef7ac52367f8d3b474a960eaec960b5aa1bf0
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - perltidy file for style consistency

and document the perltidy options used so any future maintainers can follow the style of the file more easily.
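
The constants commits above name FNV32_PRIME, U32_MAX and fnv1a_32() without showing the hash itself. A minimal sketch of 32-bit FNV-1a follows, assuming a perl with 64-bit integers and byte-string keys. The sub name fnv1a_32_sketch and the choice of the standard offset basis as the seed are illustrative and not taken from regen/mph.pl:

```perl
use strict;
use warnings;

use constant {
    FNV32_PRIME => 16777619,      # 0x01000193, the standard 32-bit FNV prime
    U32_MAX     => 0xFFFFFFFF,
};

# 32-bit FNV-1a: XOR in each byte, then multiply by the prime, keeping only
# the low 32 bits. Assumes 64-bit integers, so the intermediate product
# (below 2**57) does not lose precision.
sub fnv1a_32_sketch {
    my ($key, $seed) = @_;
    my $hash = $seed;
    for my $byte (unpack "C*", $key) {
        $hash ^= $byte;
        $hash = ($hash * FNV32_PRIME) & U32_MAX;
    }
    return $hash;
}

# 2166136261 (0x811C9DC5) is the standard 32-bit FNV offset basis, used
# here as the seed purely for illustration.
printf "%08x\n", fnv1a_32_sketch("a", 2166136261);    # prints e40c292c
```

The mask is always U32_MAX because the function is 32 bit by definition, which is the point the log message makes about not tying it to $MASK. On a perl with 32-bit integers the multiplication would need different handling, which this sketch does not attempt.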

Commit: 88aaccdbb5a7e9ee31c33042fa54d47913c0239a
    https://github.com/Perl/perl5/commit/88aaccdbb5a7e9ee31c33042fa54d47913c0239a
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - remove unused var

Commit: 671414647169ec5282063783313cadf905597b97
    https://github.com/Perl/perl5/commit/671414647169ec5282063783313cadf905597b97
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - enable fatal warnings

If this script warns then there is something very wrong and it should be fixed, so just die immediately. Especially as if it is broken it might just *spew* warnings, which is annoying.

Commit: 9d258a7f618eab38e1756a7b2137d8fe3bef0898
    https://github.com/Perl/perl5/commit/9d258a7f618eab38e1756a7b2137d8fe3bef0898
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - convert loop to use block form and add comment

Using the comma operator to separate statements is fine on a one liner, but not so much in a script that is part of perl regen processes. IMO.

Commit: 85e69ad5747130fb935b756fb4a8afeeae9a00fe
    https://github.com/Perl/perl5/commit/85e69ad5747130fb935b756fb4a8afeeae9a00fe
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - add sanity check for idx and value parameters

If either are missing then something has gone badly wrong and we should stop processing immediately.

Commit: 5731ac49e533b9c3cedb4d8d320011f18b6fd053
    https://github.com/Perl/perl5/commit/5731ac49e533b9c3cedb4d8d320011f18b6fd053
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - bucket info storage: remove 'hash', move 'value' logic

The 'hash' key is totally unused and unneeded so drop it entirely. The 'value' key can be stored into the bucket info data elsewhere, strictly speaking it is not needed for the minimal perfect hash computations, so move it out so that logic can be changed and simplified in a future patch.

Commit: 428f849d5eda3298bb8b982571ae67eba82b51ff
    https://github.com/Perl/perl5/commit/428f849d5eda3298bb8b982571ae67eba82b51ff
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - move bucket info construction log into a sub

In a follow up patch this logic will get called from more than one place so move it to a sub so it can be used easily.

Commit: 82a8cb7ac45feccd914c1e3ae6ca7abbe872c7f4
    https://github.com/Perl/perl5/commit/82a8cb7ac45feccd914c1e3ae6ca7abbe872c7f4
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - change $hash argument to $source_hash

The term 'hash' is overused in this code, rename the $hash argument to build_perfect_hash() to $source_hash to disambiguate.

Commit: d646a7f47d9e053658b64b9341e15d0b540e6c00
    https://github.com/Perl/perl5/commit/d646a7f47d9e053658b64b9341e15d0b540e6c00
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - eliminate the need to use goto

The goto is confusing, and has the potential to introduce its own bugs if future changes are not careful, so get rid of it completely and break build_perfect_hash() into two subs.

Commit: 2038878ece684aa1e68a2456fe0a8610d2ba4414
    https://github.com/Perl/perl5/commit/2038878ece684aa1e68a2456fe0a8610d2ba4414
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - use more efficient logic in build_perfect_hash()

This patch greatly reduces the amount of work we have to do to find a bucket for keys whose first level hash does not produce any collisions.

We have two cases, those items whose $hash2 is higher than $MAX_SEED2 and those items whose $hash2 is smaller than or equal to $MAX_SEED2. For the former case we use a process similar to, but more streamlined than, the one we use for keys whose first level hash produces collisions. For the latter case we can trivially map the items into any bucket we choose.

Commit: 69376b62b982de22c2fdc7cfc91c0e58e450c2a1
    https://github.com/Perl/perl5/commit/69376b62b982de22c2fdc7cfc91c0e58e450c2a1
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl
  M uni_keywords.h

Log Message:
-----------
regen/mph.pl - Clean up diagnostics logic, allow DEBUG from env.

Be silent unless requested to. If DEBUG>1 produce lots of output, if DEBUG==1 produce some basic information about what is going on.

Commit: fd5fd3f854cb4c849bb6b8788d49bf4813a2daf9
    https://github.com/Perl/perl5/commit/fd5fd3f854cb4c849bb6b8788d49bf4813a2daf9
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - fixup die that is issued if we can't solve this hash

The old code was a bit messed up, but I didn't notice because it doesn't happen with our current data. But in theory it could. It is possible that fnv1a has a multicollision vulnerability as it is not a secure hash, so changing the seed wouldn't help. For now we can assume it does not.
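
The DEBUG behaviour described in the diagnostics commit above (silent by default, basic output at 1, verbose above 1) can be sketched as follows. The helper names debug_level, diag_lines and diag are hypothetical, and treating a non-numeric DEBUG value as 0 is an assumption of this sketch rather than something the log states:

```perl
use strict;
use warnings;

# Hypothetical diagnostics helpers: level 0 is silent, level 1 prints basic
# progress, anything above 1 also prints verbose detail. The level is read
# from the DEBUG environment variable at call time.
sub debug_level {
    my $level = $ENV{DEBUG} // 0;
    return $level =~ /\A[0-9]+\z/ ? $level + 0 : 0;
}

# Return the lines that should be shown at the current level.
sub diag_lines {
    my ($min_level, @lines) = @_;
    return debug_level() >= $min_level ? @lines : ();
}

# Print the selected lines to STDERR so generated output stays clean.
sub diag {
    my ($min_level, @lines) = @_;
    print STDERR "$_\n" for diag_lines($min_level, @lines);
}

diag(1, "building perfect hash");             # shown when DEBUG >= 1
diag(2, "trying seed 12345 for bucket 7");    # shown only when DEBUG > 1
```

Separating the level check from the printing keeps the filtering easy to test without capturing STDERR.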

Commit: 24da4c1b278e731d104f81901875f05c639928d3
    https://github.com/Perl/perl5/commit/24da4c1b278e731d104f81901875f05c639928d3
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - change fnv1a_32() to _fnv1a_32() as it is not public

This function is not part of the "public" API for this script/package and might be removed in the future if we need to, so mark it as private with a leading underscore.

Commit: f9c0d679da0e161557f6bacdc10debbc05c69a96
    https://github.com/Perl/perl5/commit/f9c0d679da0e161557f6bacdc10debbc05c69a96
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - fixup wording in comment to be grammatical

The original wording of this paragraph was a bit clumsy and not grammatically correct. This fixes it.

Commit: dc8b69df5372cabbe882cb234e20f1b34c1773c8
    https://github.com/Perl/perl5/commit/dc8b69df5372cabbe882cb234e20f1b34c1773c8
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - move split key logic out of make_mph_from_hash

The logic of calling build_split_words() twice, once in "normal" mode and once in "preprocess" mode does not belong in regen/mph.pl. So this patch renames build_split_words() to _build_split_words() and creates a new sub called build_split_words() that implements the "call it twice" logic.

For consistency the arguments are rearranged as well so the $preprocess argument is last, as build_split_words() now does not have a $preprocess argument, as only _build_split_words() needs it.

Commit: fc3ce90815623b76fd3f177e1c6dadbfe80889b5
    https://github.com/Perl/perl5/commit/fc3ce90815623b76fd3f177e1c6dadbfe80889b5
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - rename $smart_blob to $blob across file

$smart_blob is unhelpfully specific, rename to more generic $blob

Commit: 2906a1d56e77bab3976dce1126e27d9bb6cb1778
    https://github.com/Perl/perl5/commit/2906a1d56e77bab3976dce1126e27d9bb6cb1778
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - whitespace fixups

This removes trailing whitespace only. No logic changes here at all.

Commit: 3b1713f842b811f7d2cfde269d6f751b4019d1d6
    https://github.com/Perl/perl5/commit/3b1713f842b811f7d2cfde269d6f751b4019d1d6
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - move file open logic to a sub

this makes all the subs affected support a filehandle or filename

Commit: bbd333e6eee3a8de0d88c08b7e2773d2fd1ef7a7
    https://github.com/Perl/perl5/commit/bbd333e6eee3a8de0d88c08b7e2773d2fd1ef7a7
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - remove bogus defaulting for undef vars

This was some kind of thinko and never made sense, prefix and suffix should never be undefined.

Commit: fd87a368f60d2610f8be2f9d819fbffe183949fc
    https://github.com/Perl/perl5/commit/fd87a368f60d2610f8be2f9d819fbffe183949fc
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - use $n instead of @$second_level

Same thing but $n is initialized one line above, so use it.
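
The "move file open logic to a sub" commit above says the affected subs accept either a filehandle or a filename. One way such a helper might look is sketched below; the names open_file_sketch and print_report are hypothetical and the real sub in regen/mph.pl may differ:

```perl
use strict;
use warnings;

# Hypothetical helper: accept either an already-open filehandle or a file
# name. A reference (glob ref or IO object) is passed through unchanged;
# a plain string is opened with the given mode.
sub open_file_sketch {
    my ($to, $mode) = @_;
    $mode //= ">";
    return $to if ref $to;
    open my $fh, $mode, $to or die "Can't open '$to' with mode '$mode': $!";
    return $fh;
}

# A sub written against the helper works with both kinds of argument.
sub print_report {
    my ($to, @lines) = @_;
    my $fh = open_file_sketch($to);
    print {$fh} "$_\n" for @lines;
    return $fh;
}

print_report(\*STDOUT, "blob length: 8635");
```

Callers that pass a name get a fresh handle back and are responsible for closing it; callers that pass a handle keep ownership of it.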

Commit: 144ab7b4d8693e20877f9e4bbde4e7a372f72ab7
    https://github.com/Perl/perl5/commit/144ab7b4d8693e20877f9e4bbde4e7a372f72ab7
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M regen/mph.pl

Log Message:
-----------
regen/mph.pl - tweaks to generated code, put type on its own line

All the C code we have puts the type on its own line separate from the function parameter declaration, so follow that style in our generated file too. Also show the generator script in the comment that contains metadata about the file.

Commit: 2d9dce49fe5b85a3b4d3be2538a016406be333c7
    https://github.com/Perl/perl5/commit/2d9dce49fe5b85a3b4d3be2538a016406be333c7
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M charclass_invlists.h
  M lib/unicore/uni_keywords.pl
  M regen/mk_invlists.pl
  M regen/mph.pl
  M uni_keywords.h

Log Message:
-----------
regen/mph.pl & mk_invlists.pl - convert from sub interfaces to OO interfaces

The old sub based API was passing around an awkward number of arguments and it was becoming difficult to enhance in certain ways. This patch changes all the "user serviceable" functions into methods, and moves the configuration defaults into the constructor.

Note, not all the functions have been converted, the core routines with simple interfaces have not been changed. This is OO for the purpose of encapsulation not inheritance or overloading.

Commit: 00729846bf0e20a2777db3aaffb715364efe5ece
    https://github.com/Perl/perl5/commit/00729846bf0e20a2777db3aaffb715364efe5ece
Author: Yves Orton <demer...@gmail.com>
Date:   2022-04-17 (Sun, 17 Apr 2022)

Changed paths:
  M AUTHORS
  M charclass_invlists.h
  M lib/unicore/uni_keywords.pl
  M regen/mk_invlists.pl
  M regen/mph.pl
  M uni_keywords.h

Log Message:
-----------
regen/mph.pl & mk_invlists.pl - add the "_squeeze" algorithm to produce smaller blobs

The squeeze algorithm produces smaller blobs, 10-20% depending on how it is used.

With the "randomize_squeeze" option enabled it is slower but produces 20% smaller blobs than the "_simple" strategy we used to use. With the "randomize_squeeze" option disabled it is about as fast as "_simple" but produces about 10% smaller blobs. Regardless, "_squeeze" uses more memory than "_simple"; quite a bit more currently, although that is unforced and could be changed if required.

    -blob length: 10548
    +blob length: 8635
    ...
    -data size: 69908 (%67.07)
    +data size: 67995 (%65.23)

So it saves 1913 bytes running with this seed. I happened to get lucky with the seed; depending on the seed used, the blob ended up at about 8650 bytes.

This algorithm is originally by Ilya Sashcheka, so I have added him to the AUTHORS file, but unfortunately I no longer have his email address as we lost touch. It contains many modifications by me.

Compare: https://github.com/Perl/perl5/compare/34ae082d4674%5E...00729846bf0e