[ https://issues.apache.org/jira/browse/COMPRESS-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18074486#comment-18074486 ]

Subbu commented on COMPRESS-723:
--------------------------------
Can I raise a fix for this?
> Harden TAR PAX header parsing: enforce memory bound to mitigate resource
> exhaustion from oversized headers
> ----------------------------------------------------------------------------------------------------------
>
> Key: COMPRESS-723
> URL: https://issues.apache.org/jira/browse/COMPRESS-723
> Project: Commons Compress
> Issue Type: Bug
> Components: Archivers
> Reporter: Subbu
> Priority: Major
>
> PAX extended header parsing in `TarUtils.parsePaxHeaders()` is the only
> allocation path in commons-compress without an enforced, configurable memory
> bound. The soft limit is set to `Long.MAX_VALUE`, which disables the
> `MemoryLimitException` check entirely for this code path.
> This leaves applications that process untrusted TAR archives (CI/CD
> pipelines, container registries, backup restoration) unable to enforce a
> policy-driven cap on PAX header allocation. A crafted `.tar.gz` with a large
> PAX header block (text that gzip compresses at more than 1000:1) can force
> heap consumption far out of proportion to its on-wire size. While an
> implicit hard check against `totalMemory()` exists deeper in the call stack,
> it is not an intentional security control and does not allow granular
> configuration.
>
> *Solution*
> Enforce a configurable memory bound on PAX header parsing via a new
> `maxPaxHeaderSize` builder option on `TarArchiveInputStream` and `TarFile`.
> The default is 10 MB (`TarConstants.DEFAULT_MAX_PAX_HEADER_SIZE`), enforced
> through the existing `MemoryLimitException.checkBytes()` mechanism. This
> closes the last unbounded allocation surface in the TAR parsing pipeline and
> follows the same defense-in-depth pattern already established for entry names
> and 7z headers.
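
For discussion, here is a minimal sketch of the check-before-allocate idea described under *Solution*. It does not use commons-compress: `PaxHeaderBoundSketch`, `HeaderTooLargeException`, `checkBytes` and `readHeaderBlock` are hypothetical stand-ins, and the local `checkBytes` is only assumed to resemble `MemoryLimitException.checkBytes()`, whose exact signature is not shown in this issue. The 10 MB constant mirrors the default proposed above.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; none of these names are commons-compress API.
final class PaxHeaderBoundSketch {

    // The issue proposes a 10 MB default; this constant is a local stand-in.
    static final long DEFAULT_MAX_PAX_HEADER_SIZE = 10L * 1024 * 1024;

    // Hypothetical stand-in for the library's memory-limit exception.
    static final class HeaderTooLargeException extends IOException {
        HeaderTooLargeException(long requested, long limit) {
            super("PAX header of " + requested + " bytes exceeds limit of " + limit + " bytes");
        }
    }

    // Reject the declared size before any buffer is allocated.
    static void checkBytes(long requested, long limit) throws IOException {
        if (requested < 0) {
            throw new IOException("negative PAX header size: " + requested);
        }
        if (requested > limit) {
            throw new HeaderTooLargeException(requested, limit);
        }
    }

    // Reads exactly declaredSize bytes, but only after the size passed the check.
    static byte[] readHeaderBlock(InputStream in, long declaredSize, long maxSize) throws IOException {
        // Java arrays cannot exceed Integer.MAX_VALUE elements, so clamp the limit.
        checkBytes(declaredSize, Math.min(maxSize, Integer.MAX_VALUE));
        byte[] buf = new byte[(int) declaredSize];
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new EOFException("truncated PAX header");
            }
            off += n;
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        // A single PAX record: "<length> <key>=<value>\n".
        byte[] record = "30 mtime=1321711775.972059463\n".getBytes(StandardCharsets.US_ASCII);

        byte[] ok = readHeaderBlock(new ByteArrayInputStream(record), record.length,
                DEFAULT_MAX_PAX_HEADER_SIZE);
        System.out.println("accepted " + ok.length + " bytes");

        try {
            // An artificially small limit to show the rejection path.
            readHeaderBlock(new ByteArrayInputStream(record), record.length, 8);
        } catch (HeaderTooLargeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is the ordering: the declared size is compared against the configured limit before the buffer is created, so a compressed archive cannot force a large allocation merely by declaring one.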