[ 
https://issues.apache.org/jira/browse/COMPRESS-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18074486#comment-18074486
 ] 

Subbu commented on COMPRESS-723:
--------------------------------

Can I raise a fix for this?

> Harden TAR PAX header parsing: enforce memory bound to mitigate resource 
> exhaustion from oversized headers
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-723
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-723
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>            Reporter: Subbu
>            Priority: Major
>
> PAX extended header parsing in `TarUtils.parsePaxHeaders()` is the only 
> allocation path in commons-compress without an enforced, configurable memory 
> bound. The soft limit is set to `Long.MAX_VALUE`, which disables the 
> `MemoryLimitException` check entirely for this code path.
> This leaves applications that process untrusted TAR archives (CI/CD 
> pipelines, container registries, backup restoration) unable to enforce a 
> policy-driven cap on PAX header allocation. A crafted `.tar.gz` with a large 
> PAX header block (text that compresses at >1000:1 with gzip) can force 
> disproportionate heap consumption relative to its on-wire size. While an 
> implicit hard check against `totalMemory()` exists deeper in the call stack, 
> it is not an intentional security control and does not allow granular 
> configuration.
> *Solution*
> Enforce a configurable memory bound on PAX header parsing via a new 
> `maxPaxHeaderSize` builder option on `TarArchiveInputStream` and `TarFile`. 
> The default is 10 MB (`TarConstants.DEFAULT_MAX_PAX_HEADER_SIZE`), enforced 
> through the existing `MemoryLimitException.checkBytes()` mechanism. This 
> closes the last unbounded allocation surface in the TAR parsing pipeline and 
> follows the same defense-in-depth pattern already established for entry names 
> and 7z headers.
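
For illustration only, here is a minimal, self-contained sketch of the kind of bound described above. It does not use the Commons Compress API: the class name `BoundedPaxHeaderSketch`, the method signature, and the way the limit is passed in are all hypothetical, and the constant merely mirrors the 10 MB default proposed in the issue. The idea is to compare the declared size of the PAX header block against the configured limit before any buffer is allocated, and only then parse the `length key=value\n` records.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, NOT the Commons Compress implementation.
final class BoundedPaxHeaderSketch {

    // Assumption: mirrors the 10 MB default proposed in the issue.
    static final long DEFAULT_MAX_PAX_HEADER_SIZE = 10L * 1024 * 1024;

    // Reads a PAX header block of declaredSize bytes, rejecting it up front
    // if it exceeds maxPaxHeaderSize, then parses "length key=value\n" records.
    static Map<String, String> parsePaxHeaders(InputStream in, long declaredSize,
                                               long maxPaxHeaderSize) throws IOException {
        // Enforce the bound before allocating anything.
        if (declaredSize < 0 || declaredSize > maxPaxHeaderSize
                || declaredSize > Integer.MAX_VALUE) {
            throw new IOException("PAX header of " + declaredSize
                    + " bytes exceeds limit of " + maxPaxHeaderSize + " bytes");
        }
        byte[] block = in.readNBytes((int) declaredSize);
        if (block.length < declaredSize) {
            throw new IOException("Truncated PAX header");
        }

        Map<String, String> headers = new LinkedHashMap<>();
        int pos = 0;
        while (pos < block.length) {
            // The decimal record length runs up to the first space.
            int space = pos;
            while (space < block.length && block[space] != ' ') {
                space++;
            }
            if (space == block.length) {
                throw new IOException("Malformed PAX record: missing length");
            }
            int recordLen;
            try {
                recordLen = Integer.parseInt(
                        new String(block, pos, space - pos, StandardCharsets.US_ASCII));
            } catch (NumberFormatException e) {
                throw new IOException("Malformed PAX record length", e);
            }
            // The length covers the digits, the space, the payload and the newline.
            if (recordLen <= space - pos + 1
                    || recordLen > block.length - pos
                    || block[pos + recordLen - 1] != '\n') {
                throw new IOException("Malformed PAX record");
            }
            int end = pos + recordLen;
            String record = new String(block, space + 1, end - 1 - (space + 1),
                    StandardCharsets.UTF_8);
            int eq = record.indexOf('=');
            if (eq < 0) {
                throw new IOException("Malformed PAX record: missing '='");
            }
            headers.put(record.substring(0, eq), record.substring(eq + 1));
            pos = end;
        }
        return headers;
    }
}
```

With this sketch, a caller handling untrusted archives would pass its own policy value as `maxPaxHeaderSize` (or fall back to `DEFAULT_MAX_PAX_HEADER_SIZE`), and an oversized header is rejected before its contents are read. How the real option is wired into the `TarArchiveInputStream` and `TarFile` builders is left to the actual patch.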



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
