My problem with this approach is that data is reformatted before it is used, while my philosophy has always been to store data in its raw, original form and to format it when outputting (which would always be consistent). So in this case, if someone (say, in the forum of a website) starts using html, I would store the raw data in some database table.
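A minimal sketch of what I mean by "store raw, format on output". The helper name render_post() and the two context names are made up for illustration; only htmlentities() is a real PHP function here.

```php
<?php
// Hypothetical helper: format a stored post for a given output context.
// The stored value itself is never modified.
function render_post($raw, $context)
{
    switch ($context) {
        case 'html':
            // Escape only at output time, for an HTML page.
            return htmlentities($raw, ENT_QUOTES, 'UTF-8');
        case 'text':
            // A non-HTML context (say, a plain-text e-mail) needs no HTML escaping.
            return $raw;
        default:
            throw new InvalidArgumentException("Unknown output context: $context");
    }
}

// Raw value, exactly as it would sit in the database table.
$stored = '<b>Hello</b> & welcome';

echo render_post($stored, 'html'), "\n"; // &lt;b&gt;Hello&lt;/b&gt; &amp; welcome
echo render_post($stored, 'text'), "\n"; // <b>Hello</b> & welcome
```

Because the database keeps the original text, each output path can decide for itself how to present it.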
Storing the raw data makes it easy for some users to say "I want to see html code in other people's posts" and for others to say "I don't want to see their html code". In the first case I will output the text in a pretty raw form, and in the second case I will pass the data through htmlentities(). Personally, I think it's a bad idea to alter data before storing it, because you can never go back to the way it was. If I store certain information, I store it raw. When I output it, I can choose how to reformat the data, because I'm not always going to be in an HTML-based situation.

So yes, your approach is faster, but less flexible. My approach is consistent: I always store and handle my data raw, and I consider reformatting only when I output it. In your case you create exceptional situations where the default filter (which is server based, not application or website based) is not applicable. The problem, though, is that many people won't be able to rely on a default filter and will therefore have to filter everything on input. And with the way I want to handle my data (only reformatting on output), I don't want to do any filtering at all on the input. Only on the output.

Filtering out HTML on input is a very weird idea to me, because the only place where HTML tags could be abused is in the output. So that's where filtering should take place, imho. Maybe it's hard to figure out the easiest way to do this, but failing to come up with an output filtering idea should not result in input filtering "just because it's easier" (which, I'm very sorry to say, reminds me once again of magic_quotes_gpc... it's much easier to define such a rule globally, but you end up with a lot of crap). And I don't mind writing "htmlentities()" all the time when I output data from my databases to a browser.

You talk about a global policy, but a developer's policy should always include good security. So a developer with such a policy will never have to go over all his code to add "htmlentities".
He has already done that while coding. Maybe if the name of "htmlentities" was only 4 characters long, like "echo", some people would be more eager to do output filtering from the start? ;)

By the way, I use PHP for software development and I'm never in a position where a webserver admin controls what I can and cannot do, but I'm anticipating trouble for people who are in that position.

Ron

"Rasmus Lerdorf" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Well, you already have the problem. The filter hook has been in PHP5
> for 2 years now and people are using it already. And yes, your code is
> likely not to work on those servers if you are expecting raw html tags
> to get through.
>
> There are plenty of people who have to operate under a mandated security
> policy. That policy may state among many other things that no
> user-originated raw html tags may ever be displayed. Now, if I gave you
> that problem and a couple of thousand servers running millions of lines
> of PHP code, how would you solve it?
>
> My solution is to block everything and then go and fix the few places
> where raw tags are actually supposed to get through and make sure those
> few places are validated correctly.
>
> You seem to be indicating that you would go through every line of
> code and make sure every single application did all validation correctly.
>
> Want to wager a guess at who would be done first?
>
> I am wide open to other approaches to solving this problem.
>
> -Rasmus
>
> Ron Korving wrote:
> > Well there you go. A default filter. So I don't know what you mean with "For
> > the 18th time, nobody is talking about enabling it by default.", because an
> > administrator might. And I as a developer have no clue. Personally, I don't
> > see why a webserver admin should need to secure his server through means of
> > a default filter. There are good ways to secure a machine. This is not one
> > of them. You don't secure a server by setting a default that a user can
> > override. So really, that is no argument.
> >
> > Like I said before. If a webserver admin dictates the default way $_GET and
> > $_POST data is perceived, a website developer has no choice but to use this
> > filtering mechanism on every input variable he receives, because he just
> > can't rely on PHP's default behaviour anymore. You see, not everybody agrees
> > that you can't do without input filtering (myself for example), so in the
> > end, there's no doubt in my mind that forcing a new magic default on
> > PHP-users will make a lot of people unhappy.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php