Hey everyone,
as part of our security research, we studied the attack surface of the
extract() function when being called with user-controlled input. Calling
extract($_GET) allows an attacker to overwrite all variables in the
current scope, which could lead to vulnerabilities. We analyzed the use
of extract() in 28325 open source projects on GitHub looking for cases
where extract() is called with user-controlled input. extract() was
called 4923 times and 146 of those calls involved user-input. The
majority of findings were vulnerable due to being able to overwrite
superglobals.
We would like to share the findings and start a discussion on if and how
the risks of unsafe extract() usage could be reduced without breaking
backwards compatibility. The vulnerable extract() calls could be
exploited primarily by overwriting superglobals who's state is expected
to be immutable by the user like $_SESSION or $_SERVER or empty like
$_POST on a GET request. Similar to how changing $GLOBALS through
extract() is already forbidden, we propose to block overwriting all
superglobals through extract(). This would prevent most of the
vulnerabilities found in the dataset and we cannot think of a valid use
case for allowing this behavior. Here is where we rely on your feedback
and experience.
The semantics of every EXTR_ flag need to be documented, considered, and
understood properly as invalid usage could still leave the application
vulnerable. Calling extract() in the global scope is particularly risky
and the warning in the documentation could be expanded to describe the
risks in more depth. Using EXTR_IF_EXISTS in the global scope for
example would still allow overwriting superglobals instead of the
documented intended use case of defining a list of valid variables to
extract beforehand.
Since empty prefix strings are allowed, even EXTR_PREFIX_ALL would allow
to overwrite superglobals since SESSION would become <empty
prefix>_SESSION -> _SESSION. We propose to disallow the use of empty
prefix strings in extract() to prevent this. No such code pattern was
found in the dataset, but we wanted to mention it since an empty prefix
string looks like a bug when using the EXTR_PREFIX_* flags.
You can find more details in our paper [0], especially the case studies
of vulnerable usage scenarios and possible exploits in chapter 5. We
propose further possible mitigations in section 6.4 that wouldn't fit in
this email. If the proposed mitigations are not feasible, we hope the
raw study results could be of use for further discussion on their own.
We are looking forward to your feedback and any discussion on this
topic. The proposed changes aim to improve the security of PHP
applications in the hopes to avoid breaking existing code.
Kind regards
Jannik Hartung
Institute for Application Security (IAS)
Technische Universität Braunschweig
[0] https://www.ias.tu-bs.de/publications/extract-woot-2025.pdf