Am 15.06.2012 03:01, schrieb Anthony Ferrara:
I raised this topic on list over a year ago (
http://marc.info/?l=php-internals&m=130417646507744&w=2 ). It was
determined that it wasn't time yet to disable prepared statement
emulation for MySQL yet. However, Rasmus did mention that it was a
possibility for 5.4 (
http://marc.info/?l=php-internals&m=130418875017027&w=2 ). Since that
ship has sailed, I submitted a pull request for trunk to change the
default value of prepared statement emulation for MySQL.
https://github.com/php/php-src/pull/108
https://bugs.php.net/bug.php?id=54638
Does this need to be an RFC (should I draft one)? Or can it just be
pulled as-is?
Please, be aware of the consequences of this move and don't break any
tests. Also, please, do no break it down to "prepared statement ===
security" as any such _short_ statement draws an incomplete picture and
bares more risks than it does good.
Back then, Johannes already pointed out some consequences:
http://marc.info/?l=php-internals&m=130423522001744&w=2 . Here's another
iteration on the topic. There are assorted others in my blog from the
days PDO_MySQL has seen its last major update.
*Security*
It is claimed that native prepared statements are more secure because:
- statement and parameter are send to the server independently
- the server builds the final statement string
Whereas for a non-prepared statement:
- the client builds the final statement string
- the client has to escape parameter values
- the client sends statement and parameter in one
As long as client-side escaping is done properly, there is no practical
difference between the two approaches. There are, however, arguments
that boil down to limiting the possiblity of user errors:
- bind-style API
- messing up on charsets
A bind-style API may be convenient and prevent forgetting to use
escaping but does not prevent to shoot yourself doing something foolish
such as:
PDO::query("..." . $_REQUEST['killme']);
When using non-prepared statements it must be ensured to use the correct
charset for (client-side) escaping. As long as charset changes happen
through API calls, the client library is always aware of the current
charset and everything is fine. But, the charset can be changes through
SQL as well:
SET NAMES ...
The fact that PDO lacks API calls to set the charset, which enforces the
use of SET NAMES, is not helpful. However, even with appropriate API
calls users can always fool themselves using:
PDO::query(SET NAMES)
If building the final statement is done on the server or SET NAMES is
disallowed, fooling yourself becomes much harder.
*Security in the context of PDO*
Actually, the change of the default can give you a false feeling of
"improved security". This is due to the design of PDO and the existance
of a prepared statement emulation in the core of PDO.
A statement like this will use the prepared statement emulation:
SELECT something FROM table WHERE cond = :placeholder
MySQL does not support named bind parameter such as ":placeholder". The
PDO parser kicks in and does replace ":placeholder" with the bound
value. Which then, is no different from:
sprintf("SELECT something FROM table WHERE cond = '%s'", escape($param))
Of course, escape() can't be forgotten. It happens inside PHP, inside
PDO on the C level. Messing up with charsets is still possible. Thus,
users must be educated as ever since. The message "prepared statement
=== security" is too short to tell the full story.
BTW, there's the classic of:
SELECT something FROM table LIMIT :placeholder
What will happen if forget to hint a data type for :placeholer? Search
the bug database, if you can't answer. As you are at it, try the reverse
with ? and another database system that does not support ? as a
placeholder...
*Performance (and impact on server load)*
After switching to native prepared statements users may see an increase
of MySQL - PHP round trips:
prepare() client <--> server
execute() client <--> server
Great, if statements get executed multiple times. Bad, if statements are
not executed multiple times, such as:
SELECT title, received FROM news WHERE =
SELECT name, price, special_price FROM specials
Those statements will take two round trips, thus become slower. Things
become slower although there is no other win. Note, that none of the
example statements has any parameters.
A statement such as:
INSERT INTO test(col) VALUES (?)
May become faster if multiple rows are inserted. The statement string is
transferred only once during prepare. Later only the parameter value is
transferred. But, of course, you'd use MySQL multi-insert syntax or
other tricks here anyway, wouldn't you?
Back in the 70's, when prepared statements have been introduced, the
world was a different one. Prepared statements make the assumption that
its benefitial to "cache" various things, such as query execution plans,
inside the database server. Furhtermore, its assumed that a statement is
executed multiple times, making the caching a real benefit. MySQL is
from some 30 years thereafter:
- some parsing saved upon re-execution
- statements kept open for too long take short server resources
(server is always the hardest piece to scale-out!)
Keep in mind.
*Compatibility / Server support*
Recent versions of MySQL support preparing most statements,
http://dev.mysql.com/doc/refman/5.6/en/c-api-prepared-statements.html .
MySQL versions from the old days of when PDO was created did not.
Turning off the default emulation does not mean that statements will
stop to work. PDO_MySQL will fall back - as ever since - to the
emulation, if MySQL hints that it cannot prepare a statement. See above
on "false security". Whatever the default is, it does not free users to
understand the matter.
*Impact on data types*
Non-prepared and prepared statements use different flavours of the MySQL
protocol. Non-prepared, used with the PDO emulation, use the "text
protocol". Everything is returned as a string:
SELECT 1 AS _foo -> (your fetch code...) -> $_foo = "1"
Prepared statements use the "binary protocol". Native column types are
preserved in many cases:
SELECT 1 AS _foo -> (your fetch code...) -> $_foo = 1
Regarding the amount of wire traffic, there is hardly a difference
between the two flavours. The actual difference depends on the column
types. Nothing to bother about unless your website justifies to have a
dedicated MySQL development team to squeeze out the very last BIT....
Ulf
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php