dataroaring opened a new pull request, #61843:
URL: https://github.com/apache/doris/pull/61843
## Summary
Follow-up fixes from review of #61775 (cherry-pick of #60414):
- **Unbounded brace expansion (OOM risk)**: `expandBracePatterns()` fully
materializes all paths before checking `s3_head_request_max_paths`. Patterns
like `{1..100000}` or cartesian products of several brace groups could cause
high CPU/memory usage. Added a limit-aware
`expandBracePatterns(pattern, maxPaths)` that stops expansion early by
throwing `BraceExpansionTooLargeException`, avoiding large allocations.
- **Misleading glob metrics logs**: When the HEAD/getProperties optimization
succeeds and returns early, the `finally` block still logs LIST-path counters
(`elementCnt/matchCnt`) as 0. Added `usedHeadPath` flag to skip the LIST
metrics log when the HEAD optimization was used. This is especially important
for Azure where the log is at `INFO` level (always visible in production).
- **Unit tests**: Added 6 new test cases covering within-limit,
exactly-at-limit, one-over-limit, exceeds-limit, cartesian-exceeds, and
zero-means-unlimited scenarios.
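
To illustrate the limit-aware expansion described in the first bullet, here is a minimal sketch. It is not the code from this PR: the class name `BraceExpansionSketch`, the helper `expand`, and the exception's constructor are assumptions made for illustration, and the sketch handles only flat (non-nested) `{a,b}` lists and ascending `{m..n}` ranges. The point it shows is that the limit is checked before each path is stored, so at most `maxPaths` strings are ever allocated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names mirror the PR description, bodies are assumptions.
class BraceExpansionSketch {

    /** Stand-in for the exception named in the PR description. */
    public static class BraceExpansionTooLargeException extends RuntimeException {
        public BraceExpansionTooLargeException(int maxPaths) {
            super("brace expansion exceeds " + maxPaths + " paths");
        }
    }

    /** Expands brace groups left to right; maxPaths <= 0 means unlimited. */
    public static List<String> expandBracePatterns(String pattern, int maxPaths) {
        List<String> out = new ArrayList<>();
        expand(pattern, maxPaths, out);
        return out;
    }

    private static void expand(String pattern, int maxPaths, List<String> out) {
        int open = pattern.indexOf('{');
        int close = open < 0 ? -1 : pattern.indexOf('}', open);
        if (open < 0 || close < 0) {
            // No brace group left: this is a concrete path.
            // Check the limit BEFORE storing, so we never hold more than maxPaths.
            if (maxPaths > 0 && out.size() >= maxPaths) {
                throw new BraceExpansionTooLargeException(maxPaths);
            }
            out.add(pattern);
            return;
        }
        String prefix = pattern.substring(0, open);
        String body = pattern.substring(open + 1, close);
        String suffix = pattern.substring(close + 1);
        int dots = body.indexOf("..");
        if (dots > 0 && isNumber(body.substring(0, dots)) && isNumber(body.substring(dots + 2))) {
            // Numeric range {m..n}; this sketch only supports ascending ranges.
            int lo = Integer.parseInt(body.substring(0, dots));
            int hi = Integer.parseInt(body.substring(dots + 2));
            for (int i = lo; i <= hi; i++) {
                expand(prefix + i + suffix, maxPaths, out);
            }
        } else {
            // Comma-separated alternatives {a,b,c}.
            for (String alt : body.split(",", -1)) {
                expand(prefix + alt + suffix, maxPaths, out);
            }
        }
    }

    private static boolean isNumber(String s) {
        return s.matches("\\d{1,9}"); // at most 9 digits, so parseInt cannot overflow
    }

    public static void main(String[] args) {
        System.out.println(expandBracePatterns("data/{a,b}/{1..2}.csv", 10));
        try {
            expandBracePatterns("data/{1..100000}.csv", 100);
        } catch (BraceExpansionTooLargeException e) {
            // A caller could fall back to the LIST path here.
            System.out.println("fallback: " + e.getMessage());
        }
    }
}
```

In this sketch a caller would catch the exception and fall back to listing, rather than expanding first and comparing the result size against the limit afterwards.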
Note: The timestamp issues (`toEpochSecond`/`getSecond`) are addressed
separately in #61790.
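
The `usedHeadPath` change from the summary can be pictured with the following sketch. Everything except the names `usedHeadPath`, `elementCnt`, and `matchCnt` is hypothetical: the class, the method `globList`, its parameters, and the in-memory `LOG` list (a stand-in for the real logger) are assumptions for illustration only. It shows why a `finally` block needs the flag: `finally` also runs on the early return, where the LIST counters are still 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the flag-guarded metrics log; not the Doris implementation.
class GlobMetricsSketch {

    /** Stand-in for a logger so the behaviour can be inspected. */
    public static final List<String> LOG = new ArrayList<>();

    /**
     * @param expandedPaths concrete paths produced by brace expansion
     * @param headSucceeds  simulates whether the HEAD/getProperties optimization works
     * @param listing       simulates the keys a LIST request would return
     */
    public static List<String> globList(List<String> expandedPaths,
                                        boolean headSucceeds,
                                        List<String> listing) {
        long elementCnt = 0;
        long matchCnt = 0;
        boolean usedHeadPath = false;
        try {
            if (headSucceeds) {
                // Early return: the LIST counters below are never touched.
                usedHeadPath = true;
                return new ArrayList<>(expandedPaths);
            }
            List<String> result = new ArrayList<>();
            for (String key : listing) {
                elementCnt++;
                if (expandedPaths.contains(key)) {
                    matchCnt++;
                    result.add(key);
                }
            }
            return result;
        } finally {
            // Runs on both paths; skip the LIST metrics when they would be 0/0 noise.
            if (!usedHeadPath) {
                LOG.add("elementCnt=" + elementCnt + ", matchCnt=" + matchCnt);
            }
        }
    }

    public static void main(String[] args) {
        globList(List.of("a"), true, List.of("a", "b"));
        globList(List.of("a"), false, List.of("a", "b"));
        System.out.println(LOG);
    }
}
```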
## Test plan
- [ ] Existing `S3UtilTest` passes
- [ ] New limit-aware expansion tests pass (boundary cases: exactly at
limit, one over limit, cartesian product)
- [ ] Verify the S3 HEAD path fallback logs an info message when the limit is exceeded
- [ ] Verify the Azure `getProperties` path fallback works correctly
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]