On 2011.06.21 19:24, Reindl Harald wrote:
>
> On 21.06.2011 18:22, Ferenc Kovacs wrote:
>> On Tue, Jun 21, 2011 at 6:14 PM, Reindl Harald
>> <h.rei...@thelounge.net> wrote:
>>
>>> On 21.06.2011 17:55, Tomas Kuliavas wrote:
>>>
>>>> They submit it in utf-8 only if your html form allows them to do that,
>>>> or they don't follow the html specification and try to exploit your
>>>> form. Set the form input charset to iso-8859-1 and your nbspace will
>>>> take only one byte.
>>>
>>> and this naive attitude is the root of most security problems!
>>>
>>> why do you believe that every client submission is coming over
>>> your form or generally over anything you can control?
>>
>> that doesn't matter here, Tomas just corrected John, that his statement
>> that chrome will always use utf-8 encoding for some special character
>> isn't true.
>> browsers will adhere to
>> http://www.w3.org/TR/html401/interact/forms.html#adef-accept-charset
>> of course you can't trust user input, and you have to validate it, but
>> this has nothing to do with this topic
>
> it has
>
> how do you validate input if the string functions which you probably
> use for your validation have undefined results?

I've never said that he should trust user input. I've only said that
which user inputs are valid for him depends on the encoding his html form
specifies.

utf-8 is a strict format. If you expect utf-8 and someone submits
something else, you can tell that without any string function. You can
verify utf-8 strings in pcre. You can convert nbspace to a regular space,
if you want. In valid utf-8, no other byte sequence can collide with the
nbspace byte sequence (0xC2 0xA0).

--
Tomas

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
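
A minimal sketch of the "verify first, then convert" approach described in the reply. It is written in Python rather than PHP, and it uses a strict UTF-8 decode in place of the pcre check the message mentions, so treat it as an illustration of the idea and not as the author's method. The function names `is_valid_utf8` and `normalize_nbsp` are hypothetical.

```python
# U+00A0 NO-BREAK SPACE as it appears on the wire in each encoding.
NBSP_UTF8 = b"\xc2\xa0"    # two bytes in UTF-8
NBSP_LATIN1 = b"\xa0"      # one byte in ISO-8859-1


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8 (strict decode, no repair)."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def normalize_nbsp(raw: bytes) -> bytes:
    """Reject non-UTF-8 input, then replace each nbspace with a plain space.

    The byte-level replace is safe only because the input was validated
    first: in well-formed UTF-8, 0xC2 is always the lead byte of a two-byte
    sequence, so 0xC2 0xA0 cannot occur inside another character.
    """
    if not is_valid_utf8(raw):
        raise ValueError("input is not valid UTF-8")
    return raw.replace(NBSP_UTF8, b" ")
```

For example, `normalize_nbsp(b"a\xc2\xa0b")` returns `b"a b"`, while an ISO-8859-1 submission such as `b"a\xa0b"` is rejected because a lone 0xA0 byte is not well-formed UTF-8.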