On 21 Mar 2004 Chris Shiflett wrote:

> SQL injection vulnerabilities exist when you use data that the user gave
> you to create your SQL statement. So, anytime that this happens, simply
> make absolutely sure that the data you are using from the user fits a very
> specific format that you are expecting.

> To be clear: make sure the data that the user submitted only contains the
> characters you think are valid (don't bother trying to guess malicious
> characters - you're sure to miss one) and is a valid length. Once you've
> done this, and your design helps you to make sure that this step can't be
> bypassed by the user, you're protected against SQL injection.

I've recently been building defenses against SQL injection on a site
I'm working on (proactively; we haven't had a problem).  While this
principle seems exactly right, I find it harder to implement than it
sounds, and I'd argue that the results aren't as absolute as you
suggest.  You certainly have more experience with this than I do,
though, so perhaps I'm missing something.

Here's how I'm looking at it.

Pretty much any useful site tied to a database will use user data in
SQL statements, in WHERE clauses, SET clauses, or both.  This means
all input must be checked for malicious content.  The primary kinds
seem to be SQL injection and, on another front, HTML injection / XSS.

The problem is that while some attacks are well defined and the
protections against them can be logically defended, there is no list
of all possible attacks.  So I'm not sure it's ever really possible to
say "you're protected against SQL injection".  Do you feel
differently?  If so, I'd be interested to hear why.

I agree with you that checking for valid characters is safer than
checking for malicious characters, but even the former is not absolute.
The set of characters with syntactic significance cannot always be
kept separate from the set of valid input characters.  A single quote
used as an apostrophe is the obvious example.  So data that passes a
valid-character check may still contain characters that could be part
of an attack.
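
To make the apostrophe point concrete, here is a minimal sketch.  It
is in Python with an in-memory SQLite database only because that is
easy to run on its own; the table, the whitelist pattern, and the
function names are all made up for illustration and are not from the
site I'm describing.  It assumes a name field that must accept
"O'Brien", and shows that a string made only of whitelisted characters
can still change a query built by concatenation, while the same string
passed as a bound parameter is treated as plain data.

```python
import re
import sqlite3

# Hypothetical whitelist for a surname field: letters, spaces, hyphens
# and apostrophes. The apostrophe must be allowed or "O'Brien" is rejected.
NAME_RE = re.compile(r"[A-Za-z' -]{1,40}")


def is_valid_name(value):
    """Return True if the value contains only whitelisted characters."""
    return NAME_RE.fullmatch(value) is not None


def find_by_name_concat(conn, name):
    """Unsafe: the value is pasted directly into the SQL text."""
    sql = "SELECT id FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_by_name_bound(conn, name):
    """Safer: the value is sent as a bound parameter, never parsed as SQL."""
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()


def demo():
    """Build a small throwaway database to run the queries against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("O'Brien",), ("Smith",)]
    )
    return conn


if __name__ == "__main__":
    conn = demo()
    # Only letters, spaces and apostrophes, so it passes the whitelist.
    attack = "' OR name IS NOT NULL OR name IS '"
    print(is_valid_name(attack))
    print(find_by_name_concat(conn, attack))  # matches every row
    print(find_by_name_bound(conn, attack))   # matches nothing
```

Note that the concatenated version also fails on the legitimate value
"O'Brien" (the quote ends the string early and the statement no longer
parses), which is the everyday symptom of the same overlap.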

As for specifics, at the moment I truncate every element of $_POST to
a known maximum length, then run it through strip_tags, stripslashes,
and htmlspecialchars (in that order) before I use it.  Every form
input is then validated against a regexp appropriate to the type of
input expected.  I also use mysql_real_escape_string on all strings
before writing them to the database, and I put single quotes around
all integer values.  If you're game, I'm curious whether you see any
flaws in this approach.

I am still contemplating whether there is any value in running input
through htmlspecialchars, or whether I should instead simply use
htmlentities on output.  I also haven't looked at how this holds up
against nested attacks of various kinds, or whether multiple
iterations or escapes in the input data could bypass the filtering
(pointers to articles that discuss this would be welcome).
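
On the input-versus-output escaping question, the following rough
sketch shows one consequence of the choice.  It uses Python's
html.escape as a stand-in for htmlspecialchars, and the function names
and sample string are invented for the example.  It doesn't settle the
question; it only illustrates that if data is escaped when it arrives
and any output path escapes again, the text ends up double-encoded,
whereas storing the raw text and escaping once at output round-trips
cleanly.

```python
import html


def escape_on_input(raw):
    """Approach A: escape once when the form is received, store the result."""
    return html.escape(raw, quote=True)


def render_comment(stored):
    """Output step: escape whatever is about to be placed into the page."""
    return "<p>" + html.escape(stored, quote=True) + "</p>"


if __name__ == "__main__":
    raw = "Tom & Jerry's <b>site</b>"

    # Approach B: store raw text, escape only at output.
    print(render_comment(raw))

    # Approach A followed by an output escape: entities are encoded twice,
    # so the reader sees "&amp;" as literal text in the browser.
    print(render_comment(escape_on_input(raw)))
```

Escaping at input also means the stored value is longer than what the
user typed, which matters if it is truncated or length-checked
afterwards.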

Thanks,

--
Tom

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
