Wietse Venema wrote:
> M. Sokolewicz:
>> (Wietse Venema) wrote:
>>> laurent jouanneau:
>>>> (Wietse Venema) wrote:
>>>>> To give an idea of the functionality, consider the following program
>>>>> with an obvious HTML injection bug:
>>>>>
>>>>> <?php
>>>>> $username = $_GET['username'];
>>>>> echo "Welcome back, $username\n";
>>>>> ?>
>>>>>
>>>>> With default .ini settings, this program does exactly what the
>>>>> programmer wrote: it echoes the contents of the username request
>>>>> attribute, including all the malicious HTML code that an attacker
>>>>> may have supplied along with it.
>>>>>
>>>>> When I change one .ini setting:
>>>>>
>>>>> taint_error_level = E_WARNING
>>>>>
>>>>> the program produces the same output, but it also produces a warning:
>>>>>
>>>>> Warning: echo(): Argument contains data that is not converted
>>>>> with htmlspecialchars() or htmlentities() in /path/to/script
>>>>> on line 3
>>>> A PHP application doesn't always generate HTML: it can generate JSON,
>>>> CSV, PDF, etc. In those cases, we don't have to call htmlspecialchars().
>>> In that case, I suppose you would not be using echo, so there
>>> is no problem.
>>>
>> You wouldn't? So, when outputting a script-generated pdf file, how would
>> you do that if not using echo? (and thus also not print since that's
>> pretty much the exact same thing)
>
> Never mind.
>
> The code that creates PDF produces data that carries no taint
> labels. The taint labels don't jump into existence spontaneously,
> they have to be added by whoever creates that data.
>
> Thus, because PDF is created without taint labels, the echo operator
> will not object to echoing it.
>
>>>> Does this warning also appear when you output data other than
>>>> HTML? If not, how does your code guess the output type? If so, how can
>>>> we disable this warning in pages that produce JSON, etc.?
>
> Guessing is something that we do in inferior software.
>
> Data carries taint labels that say what can't be done with it. If
> the PDF creator etc. does not label its output, then there are no
> restrictions.

This doesn't make much sense to me.
Consider very common (abbreviated) code like this:

    $user_data = $_REQUEST['data'];
    switch($output_format) {
        case 'html':
            echo "<html>$user_data</html>";
            break;
        case 'xml':
            header('Content-type: text/xml');
            echo "<xml>$user_data</xml>";
            break;
        case 'json':
            header('Content-type: application/json');
            echo json_encode(array($user_data));
            break;
    }

$user_data is going to be tainted, but the untainting rules are very
different for those three cases, and popping up an error that talks
about HTML escaping only makes sense in the HTML case. That's part of
what I was talking about months ago when I brought up the problem with
context-less tainting.
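
To make the difference concrete, here is a minimal sketch of what
context-appropriate escaping might look like for those cases.
render_user_data() is a hypothetical helper made up for illustration;
it is not part of PHP or of the taint patch, and the choice of flags
is only one reasonable option:

```php
<?php
// Hypothetical helper (an assumption for illustration, not an existing API):
// the same input needs a different treatment in each output context.
function render_user_data(string $format, string $user_data): string
{
    switch ($format) {
        case 'html':
            // HTML context: escape markup characters and both quote styles.
            return '<html>'
                . htmlspecialchars($user_data, ENT_QUOTES | ENT_HTML5, 'UTF-8')
                . '</html>';
        case 'xml':
            // XML context: same function, but with XML 1.0 entity rules.
            return '<xml>'
                . htmlspecialchars($user_data, ENT_QUOTES | ENT_XML1, 'UTF-8')
                . '</xml>';
        case 'json':
            // JSON context: json_encode() does the quoting; HTML escaping
            // here would corrupt the data for the consumer.
            return json_encode(array($user_data));
        default:
            throw new InvalidArgumentException("Unknown output format: $format");
    }
}
```

Every branch ends up in echo, but only the first one would be fixed
by following a warning that mentions htmlspecialchars().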
-Rasmus
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php