Dave G wrote:
>> If that text is not properly validated and escaped, you could
>> be open to SQL Injection attacks
>
> I'm less clear on what "properly escaped" means. I thought
> escaping was a matter of putting slashes before special characters, so
> that their presence doesn't confuse the SQL queries one might run. Is it
> possible that if one has taken at least that much precaution that a user
> could still enter malicious script held in a TEXT column?
Escaping the data so it's safe to put into a database query is only part
of the solution. How the data should be escaped or validated also
depends on how it goes into the query.
If you have
WHERE id = $id
then you need to ensure $id is a number and only a number. 1, 100, 10.5,
-14.56 and 5.54E06 are all valid values for $id in this case.
is_integer(), is_numeric() and casting values with (int) or (float) ($id
= (int)$id) help here.
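Here's a minimal sketch of that numeric check. The helper name numeric_or_null() is made up for illustration; it only relies on is_numeric() and the (int) cast mentioned above.

```php
<?php
// Hypothetical helper: returns a numeric value that is safe to drop into
// "WHERE id = $id", or null if the input is not numeric at all.
function numeric_or_null($raw)
{
    if (!is_numeric($raw)) {
        return null; // reject anything that is not a number
    }
    return $raw + 0; // numeric string becomes an int or a float
}

$id     = numeric_or_null("100");       // accepted as the integer 100
$bad    = numeric_or_null("1 OR 1=1");  // rejected
$forced = (int) "1 OR 1=1";             // the cast keeps only the leading digits
```

Rejecting bad input and forcing it with a cast are two different choices. The cast never fails, so you may end up querying for an id the user didn't really ask for.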
If you have
WHERE name = '$name'
in the query, then you need to ensure any single quotes within $name are
escaped according to your database. MySQL uses backslashes, so you can
use addslashes() to escape the value of $name. Other databases use
another single quote, so you need O''Kelly instead of O\'Kelly. To
further complicate things, you have to take into account the
magic_quotes_gpc setting. If that's enabled, PHP would have already
escaped any incoming GET/COOKIE/POST/REQUEST data using addslashes(). So
if you run addslashes() again, your data will be escaped twice.
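To make the two escaping styles concrete, here's a small sketch. The table name "people" is an assumption for illustration, and str_replace() is just one simple way to double the quotes. Your database's own escaping function or prepared statements are generally the safer route when they're available.

```php
<?php
$name = "O'Kelly";

// Backslash style, as described above for MySQL
$backslash = addslashes($name);

// Quote-doubling style used by other databases
$doubled = str_replace("'", "''", $name);

// Hypothetical query using the backslash-escaped value
$query = "SELECT * FROM people WHERE name = '$backslash'";
```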
The thing to remember is that if you put O\'Kelly into the database, you
should be seeing O'Kelly inside the database when doing a SELECT. The \
is simply there to escape the quote upon executing the query. If you see
O\'Kelly actually in your database, then you're escaping your data
twice. If you find you have to use stripslashes() when you pull data
from your database (you shouldn't have to use it), then you're escaping
data twice OR you may have magic_quotes_runtime enabled (which will
escape data coming back out of databases and files, although this is off
by default).
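A quick way to see the double-escaping problem without a database: in this sketch, stripslashes() stands in for the single level of unescaping that happens when the query is executed. That stand-in is an assumption for illustration, not what your database literally calls.

```php
<?php
$original = "O'Kelly";

$once  = addslashes($original); // escaped once, as intended
$twice = addslashes($once);     // escaped again, e.g. on top of magic_quotes_gpc

// Simulate what ends up stored after the query removes one level of escaping
$storedOnce  = stripslashes($once);
$storedTwice = stripslashes($twice);
```

$storedOnce is the clean name, while $storedTwice still carries a stray backslash, which is exactly the symptom described above.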
If you have
WHERE name = "$name"
in your query, then you need to ensure double quotes are escaped within
$name. addslashes() and magic_quotes_gpc will take care of single and
double quotes, though, so you're covered there. A lot of people think
that you only need to escape single quotes, but it really depends on how
you write your queries.
Now that the data is safely in the database, you'll eventually want to
display it back to the user, right? Again, you need to ensure the data
is escaped (or more properly - encoded) so that any HTML/JavaScript/etc
within the data is not rendered on your page (unless you really want it
to). If the data came from the user, then you DO NOT want it to render,
trust me.
Now, if you're validating everything to be a number or say 5 characters,
then there's no real malicious code that could be inserted to be
rendered on your page. However, the thing to realize is that, sure,
you're only allowing 5 characters now. Tomorrow your partner comes along
and decides to allow 50 characters. He changes your substr() call to
chop it to 50 characters and changes the database column. Now, since you
weren't encoding the data before you displayed it back to the user, you
could be in trouble. The moral is that it really wouldn't hurt to encode
a string that you know will only be 5 characters just to cover things if
they ever change.
So how is this encoding done? htmlentities() is your best friend. When
you retrieve data from the database/file, you run it through
htmlentities() before putting it on your web page. So something like
an <img> tag supplied by the user will be sent with its angle brackets
encoded as &lt; and &gt; in the HTML source. The user will actually see
the literal text of the tag instead of an image box and a
possibly distasteful image.
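As a rough sketch of that, assume the user submitted an image tag as a comment. The URL and variable names are made up for illustration; the ENT_QUOTES flag and the 'UTF-8' charset are my choices here, not something your setup necessarily needs.

```php
<?php
// Hypothetical value pulled back out of the database
$comment = '<img src="http://example.com/pic.jpg">';

// Encode before it goes anywhere near the page
$safe = htmlentities($comment, ENT_QUOTES, 'UTF-8');

// $safe is now: &lt;img src=&quot;http://example.com/pic.jpg&quot;&gt;
```

The browser shows $safe as plain text, so the tag is visible but never rendered.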
Another use for htmlentities() is for when you display data back to the
user in an <input> form element. This is pretty common for when you want
to redisplay a form with the data the user gave so they can edit it,
correct it, whatever. Normally, you'll see someone do this:
<input type="text" name="name" value="<?=$name?>">
Well, what if the value of $name contains a double quote?
<input type="text" name="name" value="a double "quote">
That "HTML" will confuse the browser. It'll see "a double" as the value
of the <input> element and quote" as an unrecognized attribute. Now,
that doesn't really cause any harm, you just lose some text. But if the
user can supply a value beginning with "> (such as ">My HTML), then
they've just ended your <input> element and anything after it will be
rendered as HTML.
<input type="text" name="name" value="">My HTML">
Now you're letting them write any HTML/JavaScript/etc they want into
your page. This would allow them to inject JavaScript from a remote
site, redirect users, and steal cookie values. The PHP session id is
saved in a cookie. Once I have that session id, I can hijack your
session by providing the same session id when requesting a page on your
site.
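The form-field problem above can be reproduced in a few lines. This is only a sketch: the field name "name" and the text-input markup are assumptions, and it builds the tag with and without htmlentities() so you can compare the two results.

```php
<?php
// Value supplied by the user, starting with "> to break out of the attribute
$name = '">My HTML';

// Naive redisplay: the user's quote closes the value attribute early
$unsafe = '<input type="text" name="name" value="' . $name . '">';

// Encoded redisplay: the quote and bracket become harmless entities
$safe = '<input type="text" name="name" value="'
      . htmlentities($name, ENT_QUOTES, 'UTF-8') . '">';
```

In $unsafe everything after the first "> is rendered as HTML. In $safe the whole string stays inside the value attribute and the user sees it in the text box.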
Again, htmlentities() is your friend here. Using it on the value above
would give you this.
The "&quot;" will be in the HTML source, bu