@All,

I'd like to provide a real use case, since I feel people have gone off
on tangents of their own. For example, a list of blog posts:

<?php foreach($posts as $post): ?>
<p><a href="/blog/view/<?=$post->getID();?>"
title="<?=$escaper->escapeHtmlAttr($post->getTitle());?>">
    <?=$escaper->escapeHtml($post->getTitle());?>
</a></p>
<?php endforeach; ?>

Note the two different needs here: escaping general HTML output, and
escaping the same value inside an attribute.
This is an important problem that we need to try and solve.
htmlspecialchars() isn't good enough, or else we wouldn't need a
custom preg_match() solution like the one in the proposed RFC.
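
To make the difference concrete, here is a minimal sketch of where
htmlspecialchars() alone falls short. It assumes an unquoted attribute
and a made-up title value; the variable names are mine for illustration
and are not taken from the RFC.

```php
<?php
// Made-up user input for illustration (an assumption, not from the RFC).
$title = 'x onmouseover=alert(1)';

// htmlspecialchars() only rewrites &, <, > and (with ENT_QUOTES) both
// quote characters, so this value comes back unchanged.
$escaped = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

// Body context: harmless, the browser renders the value as text.
$body = '<p>' . $escaped . '</p>';

// Unquoted attribute context: the space ends the title value, so
// onmouseover=alert(1) is parsed as a second attribute.
$attr = '<a title=' . $escaped . '>link</a>';

echo $body, "\n", $attr, "\n";
```

Quoting the attribute closes this particular hole, which is why the
attribute case needs either stricter escaping or a guarantee about
quoting.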

I'm happy for this to be an SPL class or a function such as
escape_var() with options on it (similar to how filter_var() works
right now). Adding another extension in today's PHP ecosystem is
not going to help us at all, since only something like 2% of people
are ever going to install it. It has to be in ./ext/standard/ or
./ext/spl/.
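
As a rough userland sketch of what a filter_var()-style escape_var()
might look like: the function, the ESCAPE_* constants and the rules in
each branch are all hypothetical and are not what the RFC specifies.
The attribute branch only encodes ASCII characters and passes bytes
above 0x7F through untouched, which assumes valid UTF-8 input on a
UTF-8 page.

```php
<?php
// Hypothetical constants: nothing like these exists in PHP today.
const ESCAPE_HTML      = 1;
const ESCAPE_HTML_ATTR = 2;
const ESCAPE_URL       = 3;

// Hypothetical function, modelled loosely on how filter_var() takes a
// value plus a constant selecting the behaviour.
function escape_var(string $value, int $context = ESCAPE_HTML): string
{
    switch ($context) {
        case ESCAPE_HTML:
            return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

        case ESCAPE_HTML_ATTR:
            // Turn every ASCII character other than alphanumerics and
            // , . - _ into a hex entity. Simplification: control
            // characters and invalid UTF-8 get no special handling.
            return preg_replace_callback(
                '/[^a-zA-Z0-9,.\-_\x80-\xFF]/',
                function (array $m): string {
                    return sprintf('&#x%02X;', ord($m[0]));
                },
                $value
            );

        case ESCAPE_URL:
            // Encodes a URL component, not a whole URL.
            return rawurlencode($value);

        default:
            throw new InvalidArgumentException('Unknown escaping context');
    }
}

echo escape_var('a b', ESCAPE_HTML_ATTR), "\n";
```

The point of the single entry point is only that the context is named
explicitly at the call site; a class with one method per context would
serve the same purpose.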

@Rasmus/Stas
Are you happy with us adding a new class or function to ./ext/spl/ or
./ext/standard/? This isn't one of those shiny "must have" features;
it's actually addressing a very important problem.

For PHP developers to benefit from the escaping functions provided by
Zend/Symfony they have to actually be using those frameworks, and
that's really a small portion of the PHP code out there in the wild. If we
can introduce the new escape_var() function or a new OO class (as per
the RFC) then it's going to be readily available in the future.

Many thanks,
Paul Dragoonis.

On Tue, Sep 18, 2012 at 9:22 PM, Ángel González <keis...@gmail.com> wrote:
> On 18/09/12 21:06, Pádraic Brady wrote:
>> Hi Ángel,
>>
>> The methods all refer to literal strings, values or digits. We can't
>> reasonably escape data while allowing valid markup for the current
>> context since that's a contradiction by its very nature. If you needed
>> to let user values drive CSS names, Javascript functions or variable
>> naming, or HTML markup, you need something completely different. For
>> example, HTML markup can be sanitised against a whitelist using
>> HTMLPurifier.
>>
>>> I'm fine with the concept, but I'm not sold on the interface.
>>> It should be really clear when each of them should be used.
>>>
>>> escapeHtml()
>>> Ok, this is going to be used to show content inside a html document.
>>>
>>> escapeHtmlAttr()
>>> Use when using unquoted html attributes, otherwise use html escaping.
>>> When was the last time I saw an unquoted attribute with user-provided
>>> content?
>> Hopefully never since that's the ideal ;). However, HTML5 allows
>> unquoted attributes, which are perfectly valid. We don't make the user's
>> choice on this but we could provide the relevant tool for escaping if
>> they are completely and irredeemably insane :P.
> Someone may be insane enough to try to destroy his planet, but "some insane
> soul might want it" is no reason to build such a weapon. :)
>
> As it's a crazy thing to do, we shouldn't provide the means to do it. If
> your parameter is not a hardcoded number, just quote it and use the
> escapeX function on its content.
>
>
>>> I think it should be replaced by a quoteHtmlAttr() function which properly
>>> escapes the content and adds the quotes for you (or it might skip them
>>> if it determines it's not needed in this case).
>> The RFC focuses on escaping - not sanitising or reformatting.
> As an api client I just want to pass a parameter to the attribute.
>
> Doing
>  echo '<b style="' . $escaper->escapeHtml("font-weight: normal") . '">';
> or
>  echo '<b style=' . $escaper->quoteHtmlAttrib("font-weight: normal") . '>';
>
> is equivalent, just a distinction on the function contract. But in the
> second case the function avoids the ambiguity on whether the attribute
> used double quotes, single ones or no quote at all, since it can choose
> the one it "prefers".
>
> The goal is to make it easy to write secure code. I think the second way
> does it better. If we need to change the name of the rfc, so be it.
>
>
>>> escapeJs()
>>> Escape javascript... but inside <script> tags, I guess? So it's not to
>>> be used for dynamically generated javascript. Not so clear.
>> Javascript literal strings (as defined by the standard).
>
> Ok. We have the ' or " problem again, though.
>
>
>>> escapeCss()
>>> I'm not even sure in which cases this would be needed. Standalone CSS,
>>> inside a <style> tag, as a style="" attribute?
>> CSS values like a font size or background color. If user data is
>> allowed to alter names or any other CSS markup, you would need
>> sanitisation rather than escaping.
> I was thinking of things like dynamic class names (I have no idea why you
> would want that, though :). It may be better named escapeCssValue()
>
>
>>> escapeUrl()
>>> "It is included primarily for consistency". When do I need to use
>>> escapeUrl and when escapeHtml? What if it's a url inside a css tag
>>> inside a html document?
>> Basically any URL inside any attribute. It encodes part of a URL - the
>> overall URL would still need to be validated separately.
>
> If it encodes *part of a url*, it's not for *any url*.
>
> By "any URL inside any attribute", I'd expect a usage like:
>
> echo '<a href="' . $escaper->escapeHtml( $escaper->escapeUrl(
> "https://wiki.php.net/rfc/escaper" ) ) . '">See the rfc</a>';
>
>
> Of course, with the rawurlencode semantics, that
> https%3A%2F%2Fwiki.php.net%2Frfc%2Fescaper would be a relative url :)
> (passing a full url could be interesting for urlencoding non-ASCII
> characters in the url, although most modern browsers deal
> fine with the raw bytes)
>
>
>>> It makes things more confusing, so I'd remove it.
>> Needs to be included to maintain consistency in having a full set of
>> go-to escapers.
> It could need renaming.
>
>
>
>>> It should be clear what you are passing to that function and in which
>>> context it expects you to leave the output.
>> It might not be obvious but these are very straightforward to link to
>> specific contexts. Here's the clearest explanation of where all of
>> this fits into templating:
>> https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
>>
>> I should probably add that as a link to the RFC (Anthony will finally
>> get an ESAPI reference out of me ;)).
>>
>> Paddy
> That's a document worth reading by everyone, but I still think the
> functions of the methods should be clearer from their names.
>
> Regards
>
>
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
