On 2021-01-07 11:00, Claude Pache wrote:

On 6 Jan 2021, at 16:46, Nikita Popov <nikita....@gmail.com> wrote:

On Sat, Dec 26, 2020 at 12:03 PM Craig Francis <cr...@craigfrancis.co.uk>
wrote:

Hi,

Could htmlspecialchars() use ENT_QUOTES by default?

I recently worked on an example script, where I tried to keep it simple by
using htmlspecialchars directly, e.g.

    echo "<img src='" . htmlspecialchars($url) . "'>";

I'd completely forgotten that single quotes are not escaped by default,
creating an XSS vulnerability, e.g.

    $url = "/' onerror='alert(1)";

All the common frameworks I could find use ENT_QUOTES to do this safely
(details below).
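
To make the difference concrete, here is a minimal sketch using the example
$url from above. It passes ENT_COMPAT explicitly (the default at the time of
this thread) so that the result does not depend on the PHP version in use.
The variable names $compat and $quotes are only illustrative.

```php
<?php
// The attacker-controlled value from the example above.
$url = "/' onerror='alert(1)";

// ENT_COMPAT escapes double quotes only, so the single quotes survive.
$compat = htmlspecialchars($url, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both double and single quotes.
$quotes = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

// The attribute is delimited by single quotes, as in the original script.
echo "<img src='" . $compat . "'>\n"; // the attribute is closed early, onerror is injected
echo "<img src='" . $quotes . "'>\n"; // the value stays inside the src attribute
```

With ENT_COMPAT the string comes back unchanged, which is what lets the
onerror attribute escape from src.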

Christoph (cmb69) suggests this was done for HTML4 compatibility, with
older versions of PHP possibly having issues with numeric character
references (a quick search suggests PHP 5.4?).

PHP uses the numeric version &#039; with ENT_QUOTES, and it should continue
to do so, because the named version &apos; was added in HTML5 and can
still cause problems with legacy parsers, for example Android 4 and the
one still in use by Microsoft Outlook (&amp;, &gt; and &lt; were in the
original HTML spec, and &quot; was added in HTML2).
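
As a small illustration of the numeric versus named form: the document-type
flag passed alongside ENT_QUOTES decides which one htmlspecialchars() emits.
This is only a sketch; $numeric and $named are illustrative names.

```php
<?php
// ENT_HTML401 is the default document type: the apostrophe becomes a
// numeric character reference.
$numeric = htmlspecialchars("'", ENT_QUOTES | ENT_HTML401, 'UTF-8');

// ENT_HTML5 switches to the named entity, which legacy parsers may not know.
$named = htmlspecialchars("'", ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $numeric, "\n";
echo $named, "\n";
```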

I'd also be tempted to suggest ENT_SUBSTITUTE should be included, as I
prefer to keep as much of the valid data (rather than losing everything),
but that's not as important as escaping the apostrophe by default.
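
A minimal sketch of what ENT_SUBSTITUTE changes, assuming UTF-8 as the
declared encoding and a deliberately invalid input byte. The names $input,
$strict and $lenient are only illustrative.

```php
<?php
// 0xFF can never appear in valid UTF-8, so this string is malformed.
$input = "abc\xFFdef";

// Without ENT_SUBSTITUTE (or ENT_IGNORE), invalid input yields an empty string.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// With ENT_SUBSTITUTE, the invalid byte is replaced by U+FFFD and the
// rest of the data is kept.
$lenient = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict);
var_dump($lenient === "abc\u{FFFD}def");
```

Here the valid "abc" and "def" survive around the replacement character,
instead of the whole value being lost.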

Craig

WordPress uses ENT_QUOTES (ish):

https://developer.wordpress.org/reference/functions/esc_html/

Laravel, with Blade, uses ENT_QUOTES:

https://github.com/illuminate/support/blob/master/helpers.php#L118

Symfony or Slim, with Twig, uses ENT_QUOTES | ENT_SUBSTITUTE:

https://github.com/twigphp/Twig/blob/3.x/src/Extension/EscaperExtension.php#L243

CodeIgniter uses ENT_QUOTES | ENT_SUBSTITUTE:

https://github.com/codeigniter4/CodeIgniter4/blob/develop/system/ThirdParty/Escaper/Escaper.php#L120

CakePHP uses ENT_QUOTES | ENT_SUBSTITUTE:

https://github.com/cakephp/cakephp/blob/master/src/Core/functions.php#L67

Yii uses ENT_QUOTES | ENT_SUBSTITUTE:

https://github.com/yiisoft/yii2/blob/master/framework/helpers/BaseHtml.php#L111

Phalcon uses ENT_QUOTES:

https://github.com/phalcon/phalcon/blob/v5.0.x/src/Html/Escaper.php#L78

FuelPHP uses ENT_QUOTES:

https://github.com/fuel/core/blob/1.9/develop/config/config.php#L459

I agree that we should switch the default to ENT_QUOTES. I also agree that
we should enable ENT_SUBSTITUTE by default. I don't see any downside to
these two options.

Would you like to submit a PR?

Nikita

For ENT_SUBSTITUTE, there has been https://bugs.php.net/bug.php?id=69450,
but I don’t understand the objection in that bug report. Maybe there is some
issue related to non-Unicode multibyte encodings?

—Claude

Only the ISO-2022 encodings have bytes that can match the characters escaped by htmlspecialchars.

The objection in the bug report insists that UTF-8 parsing rules should be enforced by the sanitizing function and not by the application that displays the text. And the PHP code enforces those rules in the most unfriendly way an API can.

--

Tomas

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
