stevenpall opened a new pull request, #65141: URL: https://github.com/apache/doris/pull/65141
On branch-4.1, `FromBase64Impl::vector` and `ToBase64Impl::vector` size the output scratch buffer as `cipher_len = srclen / 2`. base64 decode writes up to `srclen*3/4` bytes and encode writes up to `4*ceil(srclen/3)` bytes, both larger than `srclen/2`. When `srclen/2` is at or below `MAX_STACK_CIPHER_LEN` (64 KiB) while the real output exceeds 64 KiB, the write overflows the 64 KiB `stack_buf` and corrupts the stack frame, causing a delayed SIGSEGV inside `StringOP::push_value_string` (the output `ColumnString` PODArray reference is clobbered).

The crash reproduces on valid, correctly padded base64 (input length a multiple of 4), so the length guard in #64788 does not prevent it.

master is not affected: it was refactored to pre-reserve the output column at the true size (`total_size += len/4*3` for decode, `4*((len+2)/3)` for encode) and write directly, which removed the `srclen/2` scratch path. That change is not in any released 4.1.x, and the auto cherry-pick of #64788 to branch-4.1 is marked `dev/4.1.x-conflict`. This PR is the minimal sizing fix for the 4.1 line.

Fix (matches `FromBase64BinaryImpl`, which already sizes correctly):

```cpp
// FromBase64Impl::vector
auto cipher_len = srclen / 4 * 3;

// ToBase64Impl::vector
auto cipher_len = (srclen + 2) / 3 * 4;
```

Verification: built stock be-4.1.2 with only this change and ran it in a production disaggregated cluster. A query decoding a 120000-character valid base64 value (about 90 KB of output, while `srclen / 2` = 60000 still selects the 64 KiB stack buffer) crashed the BE before the change and returns correctly after it. Confirmed across all rows wider than 100 KB in a real table (19382 rows, max decode 612 KB) with no BE restarts.

Repro on stock 4.1.2:

```sql
SELECT length(from_base64(v)) FROM t; -- v: a 120000-char base64 string of a repeated byte
```
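The sizing arithmetic described above can be checked in isolation. The following is a minimal sketch, not the Doris code: `kMaxStackCipherLen` and all function names are hypothetical stand-ins, and the "at or below" comparison follows the wording of this description rather than the actual branch-4.1 source.

```cpp
#include <cstddef>

// Assumption: stands in for MAX_STACK_CIPHER_LEN (64 KiB) as described in this PR.
constexpr std::size_t kMaxStackCipherLen = 64 * 1024;

// Scratch size used on branch-4.1 before the fix.
constexpr std::size_t old_scratch_len(std::size_t srclen) {
    return srclen / 2;
}

// Upper bound on decoded bytes (the fixed sizing for FromBase64Impl::vector).
constexpr std::size_t decode_bound(std::size_t srclen) {
    return srclen / 4 * 3;
}

// Encoded length including padding (the fixed sizing for ToBase64Impl::vector).
constexpr std::size_t encode_bound(std::size_t srclen) {
    return (srclen + 2) / 3 * 4;
}

// True when the old sizing would pick the 64 KiB stack buffer even though the
// decode bound exceeds it. Only the stack case from the description is modeled.
constexpr bool old_decode_overflows_stack(std::size_t srclen) {
    return old_scratch_len(srclen) <= kMaxStackCipherLen &&
           decode_bound(srclen) > kMaxStackCipherLen;
}

// The 120000-character repro: 60000 selects the stack buffer, 90000 bytes are written.
static_assert(old_scratch_len(120000) == 60000, "old scratch size");
static_assert(decode_bound(120000) == 90000, "decoded size bound");
static_assert(old_decode_overflows_stack(120000), "repro input is in the overflow window");
```

Under these assumptions, an 80000-character input stays safe with the old sizing (the decode bound is 60000 bytes, below 64 KiB), which is consistent with the crash only appearing on wide values.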
