Rasmus Lerdorf wrote:
> Sure, although if you are going to store the raw, I think it is
> pointless to store the escaped version.

Yeah, I was thinking more of escaping that is computationally
expensive, such as bbcodes or wikitext => HTML.

> I am not advocating storing it either way, I am simply saying that by
> default you should never work with raw user data. [snip] If you forget to
> fetch the raw or if you forget to re-filter it through the appropriate
> filter for whatever backend, then chances are your application won't
> work, or the user will see strange output, but at least you will be
> failing safe, instead of failing insecure.

I understand that and how your methodology works, but I've always
thought there was something fishy about it. I suppose this is the
reason: the default won't always be secure, because HTML (and other
formats too, I suppose) requires a great variety of types of escaping.

Say we're placing data in an href="" attribute; the default HTML
escaping will protect against breaking out of the quotes, but the user
can still pass javascript:xss() and cause problems. There are two levels
of escaping/validation that need to happen here: the HTML escaping, and
a URI validation.

The default can lull users into a false sense of security, especially
for more subtle vectors, whereas if you force people to be explicit
you've at least *attempted* to make them think about what output format
they should be using. Either that, or make it so that the only way for a
developer to output something like that is in a manner that also
supplies the context (for example, using a DOM builder).

Of course, careless developers will still make careless mistakes, and I
agree that a sensible default will fix the majority of these issues.
Just not 100%.

-- 
Edward Z. Yang                        GnuPG: 0x869C48DA
HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
[[ 3FA8 E9A9 7385 B691 A6FC  B3CB A933 BE7D 869C 48DA ]]

-- 
PHP Internals - PHP Runtime Development Mailing List
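
To make the href="" case concrete, here is a rough sketch of the two levels (URI validation first, then HTML escaping). It is written in Python rather than PHP purely for brevity. The names validate_uri and href_attribute and the ALLOWED_SCHEMES allowlist are made up for illustration, and the whitespace handling is only an approximation of what browsers do, so treat it as a sketch and not a vetted filter.

```python
import re
from html import escape
from typing import Optional

# Assumption: a deliberately small allowlist; a real application picks its own.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")
# Characters 0x00-0x20 (control characters and space).
_C0_AND_SPACE = "".join(map(chr, range(0x21)))


def validate_uri(uri: str) -> Optional[str]:
    """Level 1: return a cleaned URI if its scheme is allowed, else None."""
    # Approximation: browsers ignore tabs and newlines inside a URL and
    # trim leading/trailing control characters and spaces, so do the same
    # before looking at the scheme.
    cleaned = re.sub(r"[\t\r\n]", "", uri).strip(_C0_AND_SPACE)
    match = _SCHEME_RE.match(cleaned)
    if match is None or match.group(1).lower() not in ALLOWED_SCHEMES:
        return None  # relative URLs are rejected too, to keep the sketch strict
    return cleaned


def href_attribute(uri: str) -> str:
    """Level 2: HTML-escape the validated URI for a double-quoted attribute."""
    checked = validate_uri(uri)
    if checked is None:
        return 'href="#"'  # fail safe: drop the link target entirely
    return 'href="%s"' % escape(checked, quote=True)


# HTML escaping alone leaves the dangerous URI untouched:
print(escape("javascript:xss()", quote=True))
# Validation plus escaping neutralises it and still escapes legitimate input:
print(href_attribute("javascript:xss()"))
print(href_attribute('http://example.com/?a=1&b="2"'))
```

The first print shows why a default HTML escape is not enough in this context: there is nothing in javascript:xss() for it to escape. In PHP the second level corresponds to htmlspecialchars(); the scheme check has to come from somewhere else.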