php-general Digest 5 Oct 2011 07:40:35 -0000 Issue 7505

Topics (messages 315128 through 315150):

Re: Secure data management
        315128 by: Stuart Dallas
        315129 by: Shawn McKenzie
        315130 by: Jim Giner
        315131 by: Jeremiah Dodds
        315132 by: Stuart Dallas
        315133 by: Stuart Dallas
        315134 by: Mark Kelly
        315135 by: Stuart Dallas
        315136 by: Tommy Pham
        315137 by: Stuart Dallas
        315138 by: Tommy Pham
        315139 by: Stuart Dallas
        315140 by: Tommy Pham
        315141 by: Jeremiah Dodds
        315142 by: Stuart Dallas
        315143 by: Tommy Pham
        315144 by: Jeremiah Dodds
        315145 by: Stuart Dallas
        315146 by: Jeremiah Dodds
        315147 by: Stuart Dallas
        315148 by: Tommy Pham
        315149 by: Jeremiah Dodds
        315150 by: Tommy Pham

Administrivia:

To subscribe to the digest, e-mail:
        php-general-digest-subscr...@lists.php.net

To unsubscribe from the digest, e-mail:
        php-general-digest-unsubscr...@lists.php.net

To post to the list, e-mail:
        php-gene...@lists.php.net


----------------------------------------------------------------------
--- Begin Message ---
On 4 Oct 2011, at 20:23, Jim Giner wrote:

> I thought I knew how to do this.
> 
> I have a form that collects some data fields.  My script checks if magic 
> quotes are off and (since they are) executes "addslashes" on each input 
> field.  Then I run a query to INSERT these 'slashed' vars into the database. 
> But when I go to phpadmin on my site the table does not contain any slashes.
> 
> Where are they going? 

1. Why are you using addslashes?

2. MySQL will strip one level of backslashes.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On 10/04/2011 02:23 PM, Jim Giner wrote:
> I thought I knew how to do this.
> 
> I have a form that collects some data fields.  My script checks if magic 
> quotes are off and (since they are) executes "addslashes" on each input 
> field.  Then I run a query to INSERT these 'slashed' vars into the database. 
> But when I go to phpadmin on my site the table does not contain any slashes.
> 
> Where are they going? 
> 
> 

The slashes escape "data" just to tell the database that those
characters are data.  The database doesn't insert the slash, that would
be unwanted.  Not all databases use the slash as an escape character and
for the ones that do you should use the X_real_escape_string(), like
mysql_real_escape_string() instead of addslashes()


-- 
Thanks!
-Shawn
http://www.spidean.com

--- End Message ---
--- Begin Message ---
"Stuart Dallas" <stu...@3ft9.com> wrote in message 
news:da8b3499-4d11-4053-9834-68b34d030...@3ft9.com...
1. Why are you using addslashes?

2. MySQL will strip one level of backslashes.
*********


I thought you were supposed to do an addslashes to protect your appl from 
malicious d/e.

Did not know that mysql drops the slashes. 



--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 2:44 PM, Jim Giner <jim.gi...@albanyhandball.com> wrote:

>
> I thought you were supposed to do an addslashes to protect your appl from
> malicious d/e.
>

To protect your app from malicious stuff going to SQL queries, you
should be using prepared statements, see
http://php.net/manual/en/pdo.prepared-statements.php

--- End Message ---
--- Begin Message ---
On 4 Oct 2011, at 20:30, Shawn McKenzie wrote:

> On 10/04/2011 02:23 PM, Jim Giner wrote:
>> I thought I knew how to do this.
>> 
>> I have a form that collects some data fields.  My script checks if magic 
>> quotes are off and (since they are) executes "addslashes" on each input 
>> field.  Then I run a query to INSERT these 'slashed' vars into the database. 
>> But when I go to phpadmin on my site the table does not contain any slashes.
>> 
>> Where are they going? 
>> 
>> 
> 
> The slashes escape "data" just to tell the database that those
> characters are data.  The database doesn't insert the slash, that would
> be unwanted.  Not all databases use the slash as an escape character and
> for the ones that do you should use the X_real_escape_string(), like
> mysql_real_escape_string() instead of addslashes()

http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On 4 Oct 2011, at 20:44, Jim Giner wrote:

> "Stuart Dallas" <stu...@3ft9.com> wrote in message 
> news:da8b3499-4d11-4053-9834-68b34d030...@3ft9.com...
> 1. Why are you using addslashes?
> 
> 2. MySQL will strip one level of backslashes.
> *********
> 
> 
> I thought you were supposed to do an addslashes to protect your appl from 
> malicious d/e.

Adding slashes to the data is nowhere near enough protection. Jeremiah is right 
in saying that prepared statements are the best option available at the moment.

> Did not know that mysql drops the slashes. 

I recommend that you look further into why you are doing things like that, 
especially when it's security-related. The more you know about what is 
happening and why the better your software will be.

In this particular case, the slashes are designed to mark quotes as part of the 
data and not the end of the data. For example...

    "this is an unescaped string containing " a quotation mark"

The MySQL parser will see the " in the middle and decide that that's the end of 
the data. However...

    "this is an escaped string containing \" a quotation mark"

The parser will see the \ before the " and that tells it the quote is part of 
the data. Because the \ is only there to tell it that it doesn't get left in 
the data when it's pushed into the database.

But escaping quotes (i.e. addslashes) is not enough to protect against SQL 
injection, and neither is mysql_real_escape_string as Shawn suggested. Prepared 
statements are the best option.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
Hi.

On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:

> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/

Thanks. I followed this link through and read the full message (having missed 
it the first time round), and while I find the idea of using base64 to 
sanitise text interesting I can also forsee a few difficulties:

It would prevent anyone from accessing the database directly and getting 
meaningful results unless the en/decode is in triggers, or maybe stored 
procedures. No more one-off command-line queries.

How would you search an encoded column for matching text?

I'd be interested in any ideas folk have about these issues, or any others 
they can envisage with this proposal.

Cheers,

Mark

--- End Message ---
--- Begin Message ---
On 5 Oct 2011, at 00:04, Mark Kelly wrote:

> Hi.
> 
> On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
> 
>> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
> 
> Thanks. I followed this link through and read the full message (having missed 
> it the first time round), and while I find the idea of using base64 to 
> sanitise text interesting I can also forsee a few difficulties:
> 
> It would prevent anyone from accessing the database directly and getting 
> meaningful results unless the en/decode is in triggers, or maybe stored 
> procedures. No more one-off command-line queries.
> 
> How would you search an encoded column for matching text?
> 
> I'd be interested in any ideas folk have about these issues, or any others 
> they can envisage with this proposal.

Base64 encoding will work when the native base64 functions are available in 
MySQL which will allow you to base64 encode the data into a statement like 
INSERT INTO table SET field = FROM_BASE64("<?php echo base64_encode($data); 
?>") sorta thing. I'm still not a massive fan of that idea given that prepared 
statements are an option, but it would work.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 4:11 PM, Stuart Dallas <stu...@3ft9.com> wrote:

> On 5 Oct 2011, at 00:04, Mark Kelly wrote:
>
> > Hi.
> >
> > On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
> >
> >> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
> >
> > Thanks. I followed this link through and read the full message (having
> missed
> > it the first time round), and while I find the idea of using base64 to
> > sanitise text interesting I can also forsee a few difficulties:
> >
> > It would prevent anyone from accessing the database directly and getting
> > meaningful results unless the en/decode is in triggers, or maybe stored
> > procedures. No more one-off command-line queries.
> >
> > How would you search an encoded column for matching text?
> >
> > I'd be interested in any ideas folk have about these issues, or any
> others
> > they can envisage with this proposal.
>
> Base64 encoding will work when the native base64 functions are available in
> MySQL which will allow you to base64 encode the data into a statement like
> INSERT INTO table SET field = FROM_BASE64("<?php echo base64_encode($data);
> ?>") sorta thing. I'm still not a massive fan of that idea given that
> prepared statements are an option, but it would work.
>
> -Stuart
>
> --
> Stuart Dallas
> 3ft9 Ltd
> http://3ft9.com/
> --
>
>
Inserting and updating isn't the problem.  I think Mark referring to is how
would that be implemented in this simple type of query:

SELECT * FROM my_table WHERE col_name LIKE '%key word%';

If there's no viable mean to filter the data, that storage method/medium is
rather pointless, IMHO.

--- End Message ---
--- Begin Message ---
On 5 Oct 2011, at 00:45, Tommy Pham wrote:

> On Tue, Oct 4, 2011 at 4:11 PM, Stuart Dallas <stu...@3ft9.com> wrote:
> On 5 Oct 2011, at 00:04, Mark Kelly wrote:
> 
> > Hi.
> >
> > On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
> >
> >> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
> >
> > Thanks. I followed this link through and read the full message (having 
> > missed
> > it the first time round), and while I find the idea of using base64 to
> > sanitise text interesting I can also forsee a few difficulties:
> >
> > It would prevent anyone from accessing the database directly and getting
> > meaningful results unless the en/decode is in triggers, or maybe stored
> > procedures. No more one-off command-line queries.
> >
> > How would you search an encoded column for matching text?
> >
> > I'd be interested in any ideas folk have about these issues, or any others
> > they can envisage with this proposal.
> 
> Base64 encoding will work when the native base64 functions are available in 
> MySQL which will allow you to base64 encode the data into a statement like 
> INSERT INTO table SET field = FROM_BASE64("<?php echo base64_encode($data); 
> ?>") sorta thing. I'm still not a massive fan of that idea given that 
> prepared statements are an option, but it would work.
> 
> -Stuart
> 
> --
> Stuart Dallas
> 3ft9 Ltd
> http://3ft9.com/
> --
> 
> 
> Inserting and updating isn't the problem.  I think Mark referring to is how 
> would that be implemented in this simple type of query:
> 
> SELECT * FROM my_table WHERE col_name LIKE '%key word%';
> 
> If there's no viable mean to filter the data, that storage method/medium is 
> rather pointless, IMHO.

Go back and read what I wrote again. Base64 is only being used to transmit the 
data to MySQL - it's being stored in the database in its decoded form.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 4:49 PM, Stuart Dallas <stu...@3ft9.com> wrote:

>
> On 5 Oct 2011, at 00:45, Tommy Pham wrote:
>
> On Tue, Oct 4, 2011 at 4:11 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>
>> On 5 Oct 2011, at 00:04, Mark Kelly wrote:
>>
>> > Hi.
>> >
>> > On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
>> >
>> >> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
>> >
>> > Thanks. I followed this link through and read the full message (having
>> missed
>> > it the first time round), and while I find the idea of using base64 to
>> > sanitise text interesting I can also forsee a few difficulties:
>> >
>> > It would prevent anyone from accessing the database directly and getting
>> > meaningful results unless the en/decode is in triggers, or maybe stored
>> > procedures. No more one-off command-line queries.
>> >
>> > How would you search an encoded column for matching text?
>> >
>> > I'd be interested in any ideas folk have about these issues, or any
>> others
>> > they can envisage with this proposal.
>>
>> Base64 encoding will work when the native base64 functions are available
>> in MySQL which will allow you to base64 encode the data into a statement
>> like INSERT INTO table SET field = FROM_BASE64("<?php echo
>> base64_encode($data); ?>") sorta thing. I'm still not a massive fan of that
>> idea given that prepared statements are an option, but it would work.
>>
>> -Stuart
>>
>> --
>> Stuart Dallas
>> 3ft9 Ltd
>> http://3ft9.com/
>> --
>>
>>
> Inserting and updating isn't the problem.  I think Mark referring to is how
> would that be implemented in this simple type of query:
>
> SELECT * FROM my_table WHERE col_name LIKE '%key word%';
>
> If there's no viable mean to filter the data, that storage method/medium is
> rather pointless, IMHO.
>
>
> Go back and read what I wrote again. Base64 is only being used to transmit
> the data to MySQL - it's being stored in the database in its decoded form.
>
> -Stuart
>
> --
> Stuart Dallas
> 3ft9 Ltd
> http://3ft9.com/
>

The question still applies as how would you safeguard that 'key word'
transmission, especially against SQL injection.  I suppose one could do it
this way:

SELECT * FROM my_table WHERE col_name LIKE CONCAT('%', FROM_BASE64("<?php
echo base64_encode($data); ?>"), '%')

Is the overhead worth it to warrant that kind of safeguard?  That's just a
simple query with a simple search criteria.  What about in the case of
subselect and multi-table joins?

--- End Message ---
--- Begin Message ---
On 5 Oct 2011, at 01:13, Tommy Pham wrote:

> On Tue, Oct 4, 2011 at 4:49 PM, Stuart Dallas <stu...@3ft9.com> wrote:
> 
> On 5 Oct 2011, at 00:45, Tommy Pham wrote:
> 
>> On Tue, Oct 4, 2011 at 4:11 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>> On 5 Oct 2011, at 00:04, Mark Kelly wrote:
>> 
>> > Hi.
>> >
>> > On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
>> >
>> >> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
>> >
>> > Thanks. I followed this link through and read the full message (having 
>> > missed
>> > it the first time round), and while I find the idea of using base64 to
>> > sanitise text interesting I can also forsee a few difficulties:
>> >
>> > It would prevent anyone from accessing the database directly and getting
>> > meaningful results unless the en/decode is in triggers, or maybe stored
>> > procedures. No more one-off command-line queries.
>> >
>> > How would you search an encoded column for matching text?
>> >
>> > I'd be interested in any ideas folk have about these issues, or any others
>> > they can envisage with this proposal.
>> 
>> Base64 encoding will work when the native base64 functions are available in 
>> MySQL which will allow you to base64 encode the data into a statement like 
>> INSERT INTO table SET field = FROM_BASE64("<?php echo base64_encode($data); 
>> ?>") sorta thing. I'm still not a massive fan of that idea given that 
>> prepared statements are an option, but it would work.
>> 
>> -Stuart
>> 
>> --
>> Stuart Dallas
>> 3ft9 Ltd
>> http://3ft9.com/
>> --
>> 
>> 
>> Inserting and updating isn't the problem.  I think Mark referring to is how 
>> would that be implemented in this simple type of query:
>> 
>> SELECT * FROM my_table WHERE col_name LIKE '%key word%';
>> 
>> If there's no viable mean to filter the data, that storage method/medium is 
>> rather pointless, IMHO.
> 
> Go back and read what I wrote again. Base64 is only being used to transmit 
> the data to MySQL - it's being stored in the database in its decoded form.
> 
> -Stuart
> 
> -- 
> Stuart Dallas
> 3ft9 Ltd
> http://3ft9.com/
> 
> The question still applies as how would you safeguard that 'key word' 
> transmission, especially against SQL injection.  I suppose one could do it 
> this way:
> 
> SELECT * FROM my_table WHERE col_name LIKE CONCAT('%', FROM_BASE64("<?php 
> echo base64_encode($data); ?>"), '%')
> 
> Is the overhead worth it to warrant that kind of safeguard?  That's just a 
> simple query with a simple search criteria.  What about in the case of 
> subselect and multi-table joins?

That would indeed be logical if base64 was your chosen method of protection, 
but I think prepared statements are a far more elegant solution. As for the 
overhead I very much doubt there's much difference between that and the 
overhead of prepared statements.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 5:51 PM, Stuart Dallas <stu...@3ft9.com> wrote:

> On 5 Oct 2011, at 01:13, Tommy Pham wrote:
>
> On Tue, Oct 4, 2011 at 4:49 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>
>>
>> On 5 Oct 2011, at 00:45, Tommy Pham wrote:
>>
>> On Tue, Oct 4, 2011 at 4:11 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>>
>>> On 5 Oct 2011, at 00:04, Mark Kelly wrote:
>>>
>>> > Hi.
>>> >
>>> > On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
>>> >
>>> >> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
>>> >
>>> > Thanks. I followed this link through and read the full message (having
>>> missed
>>> > it the first time round), and while I find the idea of using base64 to
>>> > sanitise text interesting I can also forsee a few difficulties:
>>> >
>>> > It would prevent anyone from accessing the database directly and
>>> getting
>>> > meaningful results unless the en/decode is in triggers, or maybe stored
>>> > procedures. No more one-off command-line queries.
>>> >
>>> > How would you search an encoded column for matching text?
>>> >
>>> > I'd be interested in any ideas folk have about these issues, or any
>>> others
>>> > they can envisage with this proposal.
>>>
>>> Base64 encoding will work when the native base64 functions are available
>>> in MySQL which will allow you to base64 encode the data into a statement
>>> like INSERT INTO table SET field = FROM_BASE64("<?php echo
>>> base64_encode($data); ?>") sorta thing. I'm still not a massive fan of that
>>> idea given that prepared statements are an option, but it would work.
>>>
>>> -Stuart
>>>
>>> --
>>> Stuart Dallas
>>> 3ft9 Ltd
>>> http://3ft9.com/
>>> --
>>>
>>>
>> Inserting and updating isn't the problem.  I think Mark referring to is
>> how would that be implemented in this simple type of query:
>>
>> SELECT * FROM my_table WHERE col_name LIKE '%key word%';
>>
>> If there's no viable mean to filter the data, that storage method/medium
>> is rather pointless, IMHO.
>>
>>
>> Go back and read what I wrote again. Base64 is only being used to transmit
>> the data to MySQL - it's being stored in the database in its decoded form.
>>
>> -Stuart
>>
>> --
>> Stuart Dallas
>> 3ft9 Ltd
>> http://3ft9.com/
>>
>
> The question still applies as how would you safeguard that 'key word'
> transmission, especially against SQL injection.  I suppose one could do it
> this way:
>
> SELECT * FROM my_table WHERE col_name LIKE CONCAT('%', FROM_BASE64("<?php
> echo base64_encode($data); ?>"), '%')
>
> Is the overhead worth it to warrant that kind of safeguard?  That's just a
> simple query with a simple search criteria.  What about in the case of
> subselect and multi-table joins?
>
>
> That would indeed be logical if base64 was your chosen method of
> protection, but I think prepared statements are a far more elegant solution.
> As for the overhead I very much doubt there's much difference between that
> and the overhead of prepared statements.
>
> -Stuart
>
> --
> Stuart Dallas
> 3ft9 Ltd
> http://3ft9.com/
>

IIRC, prepared statements doesn't incur any overhead.  Instead, it's
supposed to enhance performance by telling SQL to 'prepare' via
compilation.  So if you're comparing performance between the overhead of
base64 vs prepared statement, then the difference would be quite clear,
especially when the table(s) is/are more than a couple hundred thousand rows
and the queri(es) are complex.  This is not mention the added complexity
into the application where managing and expanding it would incur real
(developer time) overhead, IMO.

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 7:51 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>  As for the overhead I very much doubt there's much difference between that 
> and the overhead of prepared statements.

Probably not. As an aside, I'm really struggling to find a case where
it'd be worth base64-encoding the queries like that unless you were
both concerned about someone sniffing your queries over the wire and
sure that they wouldn't think to base-64 decode them. Not to mention
that if your grand idea to prevent eavesdropping is simple transforms,
you've got a larger problem on your hands.

It *will* work, as mysql's base64 decoder won't evaluate the decoded
string as a statement, afaik, but it will also expand the size of
stuff by around 30% while having a, imo, much better solution widely
available.

--- End Message ---
--- Begin Message ---
On 5 Oct 2011, at 02:02, Tommy Pham wrote:

> On Tue, Oct 4, 2011 at 5:51 PM, Stuart Dallas <stu...@3ft9.com> wrote:
> On 5 Oct 2011, at 01:13, Tommy Pham wrote:
> 
>> On Tue, Oct 4, 2011 at 4:49 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>> 
>> On 5 Oct 2011, at 00:45, Tommy Pham wrote:
>> 
>>> On Tue, Oct 4, 2011 at 4:11 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>>> On 5 Oct 2011, at 00:04, Mark Kelly wrote:
>>> 
>>> > Hi.
>>> >
>>> > On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
>>> >
>>> >> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
>>> >
>>> > Thanks. I followed this link through and read the full message (having 
>>> > missed
>>> > it the first time round), and while I find the idea of using base64 to
>>> > sanitise text interesting I can also forsee a few difficulties:
>>> >
>>> > It would prevent anyone from accessing the database directly and getting
>>> > meaningful results unless the en/decode is in triggers, or maybe stored
>>> > procedures. No more one-off command-line queries.
>>> >
>>> > How would you search an encoded column for matching text?
>>> >
>>> > I'd be interested in any ideas folk have about these issues, or any others
>>> > they can envisage with this proposal.
>>> 
>>> Base64 encoding will work when the native base64 functions are available in 
>>> MySQL which will allow you to base64 encode the data into a statement like 
>>> INSERT INTO table SET field = FROM_BASE64("<?php echo base64_encode($data); 
>>> ?>") sorta thing. I'm still not a massive fan of that idea given that 
>>> prepared statements are an option, but it would work.
>>> 
>>> 
>>> Inserting and updating isn't the problem.  I think Mark referring to is how 
>>> would that be implemented in this simple type of query:
>>> 
>>> SELECT * FROM my_table WHERE col_name LIKE '%key word%';
>>> 
>>> If there's no viable mean to filter the data, that storage method/medium is 
>>> rather pointless, IMHO.
>> 
>> Go back and read what I wrote again. Base64 is only being used to transmit 
>> the data to MySQL - it's being stored in the database in its decoded form.
>> 
>> 
>> The question still applies as how would you safeguard that 'key word' 
>> transmission, especially against SQL injection.  I suppose one could do it 
>> this way:
>> 
>> SELECT * FROM my_table WHERE col_name LIKE CONCAT('%', FROM_BASE64("<?php 
>> echo base64_encode($data); ?>"), '%')
>> 
>> Is the overhead worth it to warrant that kind of safeguard?  That's just a 
>> simple query with a simple search criteria.  What about in the case of 
>> subselect and multi-table joins?
> 
> That would indeed be logical if base64 was your chosen method of protection, 
> but I think prepared statements are a far more elegant solution. As for the 
> overhead I very much doubt there's much difference between that and the 
> overhead of prepared statements.
> 
> 
> IIRC, prepared statements doesn't incur any overhead.  Instead, it's supposed 
> to enhance performance by telling SQL to 'prepare' via compilation.  So if 
> you're comparing performance between the overhead of base64 vs prepared 
> statement, then the difference would be quite clear, especially when the 
> table(s) is/are more than a couple hundred thousand rows and the queri(es) 
> are complex.  This is not mention the added complexity into the application 
> where managing and expanding it would incur real (developer time) overhead, 
> IMO.

Prepared statements incur an additional hit against the DB server to prepare 
the statement.

The cost of using base64 in the manner suggested is minimal, regardless of the 
size of the data. The MySQL query analyser is intelligent enough to know that 
from_base64('xyz') is a constant expression and will therefore only evaluate it 
once.

As for the added complexity, if you have SQL statements all over your code then 
yes it will add a time overhead, but any codebase of a significant size should 
be using a centralised API for database access such that changes like this have 
a very limited scope.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 6:07 PM, Jeremiah Dodds <jeremiah.do...@gmail.com>wrote:

> On Tue, Oct 4, 2011 at 7:51 PM, Stuart Dallas <stu...@3ft9.com> wrote:
> >  As for the overhead I very much doubt there's much difference between
> that and the overhead of prepared statements.
>
> Probably not. As an aside, I'm really struggling to find a case where
> it'd be worth base64-encoding the queries like that unless you were
> both concerned about someone sniffing your queries over the wire and
> sure that they wouldn't think to base-64 decode them. Not to mention
> that if your grand idea to prevent eavesdropping is simple transforms,
>

If that's the case, then SSL would be a better solution since it also
protects the authentication process.  In then end, I still don't see base64
as a viable solution.


> you've got a larger problem on your hands.
>
> It *will* work, as mysql's base64 decoder won't evaluate the decoded
> string as a statement, afaik, but it will also expand the size of
> stuff by around 30% while having a, imo, much better solution widely
> available.
>
>

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 8:10 PM, Stuart Dallas <stu...@3ft9.com> wrote:
> Prepared statements incur an additional hit against the DB server to prepare 
> the statement.

But only once, right? This could, of course, still be a downside
depending the nature of your app.

--- End Message ---
--- Begin Message ---
On 5 Oct 2011, at 02:07, Jeremiah Dodds wrote:

> On Tue, Oct 4, 2011 at 7:51 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>> As for the overhead I very much doubt there's much difference between that 
>> and the overhead of prepared statements.
> 
> Probably not. As an aside, I'm really struggling to find a case where
> it'd be worth base64-encoding the queries like that unless you were
> both concerned about someone sniffing your queries over the wire and
> sure that they wouldn't think to base-64 decode them. Not to mention
> that if your grand idea to prevent eavesdropping is simple transforms,
> you've got a larger problem on your hands.

I don't see a reason to use base64 to solve the SQL injection problem either, 
especially with prepared statements available, but that doesn't mean it won't 
work.

As far as protecting data during transit, that's what SSL is for. Base64 is not 
an encryption mechanism.

> It *will* work, as mysql's base64 decoder won't evaluate the decoded
> string as a statement, afaik, but it will also expand the size of
> stuff by around 30% while having a, imo, much better solution widely
> available.

It will indeed increase the size of the queries, but unless you're running 
Facebook, LAN capacity is very rarely a bottleneck.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 8:15 PM, Tommy Pham <tommy...@gmail.com> wrote:
> On Tue, Oct 4, 2011 at 6:07 PM, Jeremiah Dodds <jeremiah.do...@gmail.com>
> wrote:
>>
>> On Tue, Oct 4, 2011 at 7:51 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>> >  As for the overhead I very much doubt there's much difference between
>> > that and the overhead of prepared statements.
>>
>> Probably not. As an aside, I'm really struggling to find a case where
>> it'd be worth base64-encoding the queries like that unless you were
>> both concerned about someone sniffing your queries over the wire and
>> sure that they wouldn't think to base-64 decode them. Not to mention
>> that if your grand idea to prevent eavesdropping is simple transforms,
>
> If that's the case, then SSL would be a better solution since it also
> protects the authentication process.  In then end, I still don't see base64
> as a viable solution.

*nods*.

I didn't mention encryption because I figured it was the obvious solution there.

--- End Message ---
--- Begin Message ---
On 5 Oct 2011, at 02:16, Jeremiah Dodds wrote:

> On Tue, Oct 4, 2011 at 8:10 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>> Prepared statements incur an additional hit against the DB server to prepare 
>> the statement.
> 
> But only once, right? This could, of course, still be a downside
> depending the nature of your app.

Once per statement per request, yes. Most web apps execute one-off statements 
during each request, so the ability to reuse a prepared statement is not a 
helpful feature for that environment.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 6:10 PM, Stuart Dallas <stu...@3ft9.com> wrote:

>
> On 5 Oct 2011, at 02:02, Tommy Pham wrote:
>
> On Tue, Oct 4, 2011 at 5:51 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>
>> On 5 Oct 2011, at 01:13, Tommy Pham wrote:
>>
>> On Tue, Oct 4, 2011 at 4:49 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>>
>>>
>>> On 5 Oct 2011, at 00:45, Tommy Pham wrote:
>>>
>>> On Tue, Oct 4, 2011 at 4:11 PM, Stuart Dallas <stu...@3ft9.com> wrote:
>>>
>>>> On 5 Oct 2011, at 00:04, Mark Kelly wrote:
>>>>
>>>> > Hi.
>>>> >
>>>> > On Tuesday 04 Oct 2011 at 21:39 Stuart Dallas wrote:
>>>> >
>>>> >> http://stut.net/2011/09/15/mysql-real-escape-string-is-not-enough/
>>>> >
>>>> > Thanks. I followed this link through and read the full message (having
>>>> missed
>>>> > it the first time round), and while I find the idea of using base64 to
>>>> > sanitise text interesting I can also forsee a few difficulties:
>>>> >
>>>> > It would prevent anyone from accessing the database directly and
>>>> getting
>>>> > meaningful results unless the en/decode is in triggers, or maybe
>>>> stored
>>>> > procedures. No more one-off command-line queries.
>>>> >
>>>> > How would you search an encoded column for matching text?
>>>> >
>>>> > I'd be interested in any ideas folk have about these issues, or any
>>>> others
>>>> > they can envisage with this proposal.
>>>>
>>>> Base64 encoding will work when the native base64 functions are available
>>>> in MySQL which will allow you to base64 encode the data into a statement
>>>> like INSERT INTO table SET field = FROM_BASE64("<?php echo
>>>> base64_encode($data); ?>") sorta thing. I'm still not a massive fan of that
>>>> idea given that prepared statements are an option, but it would work.
>>>>
>>>>
>>> Inserting and updating isn't the problem.  I think Mark referring to is
>>> how would that be implemented in this simple type of query:
>>>
>>> SELECT * FROM my_table WHERE col_name LIKE '%key word%';
>>>
>>> If there's no viable mean to filter the data, that storage method/medium
>>> is rather pointless, IMHO.
>>>
>>>
>>> Go back and read what I wrote again. Base64 is only being used to
>>> transmit the data to MySQL - it's being stored in the database in its
>>> decoded form.
>>>
>>>
>> The question still applies as how would you safeguard that 'key word'
>> transmission, especially against SQL injection.  I suppose one could do it
>> this way:
>>
>> SELECT * FROM my_table WHERE col_name LIKE CONCAT('%', FROM_BASE64("<?php
>> echo base64_encode($data); ?>"), '%')
>>
>> Is the overhead worth it to warrant that kind of safeguard?  That's just a
>> simple query with a simple search criteria.  What about in the case of
>> subselect and multi-table joins?
>>
>>
>> That would indeed be logical if base64 was your chosen method of
>> protection, but I think prepared statements are a far more elegant solution.
>> As for the overhead I very much doubt there's much difference between that
>> and the overhead of prepared statements.
>>
>>
> IIRC, prepared statements doesn't incur any overhead.  Instead, it's
> supposed to enhance performance by telling SQL to 'prepare' via
> compilation.  So if you're comparing performance between the overhead of
> base64 vs prepared statement, then the difference would be quite clear,
> especially when the table(s) is/are more than a couple hundred thousand rows
> and the queri(es) are complex.  This is not mention the added complexity
> into the application where managing and expanding it would incur real
> (developer time) overhead, IMO.
>
>
> Prepared statements incur an additional hit against the DB server to
> prepare the statement.
>
> The cost of using base64 in the manner suggested is minimal, regardless of
> the size of the data. The MySQL query analyser is intelligent enough to know
> that from_base64('xyz') is a constant expression and will therefore only
> evaluate it once.
>
>
Yes, as in your example, if you're inserting 1 row.  What if:

$hobbies = array('bicycling', 'hiking', 'reading', 'skiing', 'swimming');

* base64 method pseudo code:

loop the $hobbies foreach ($hobbies as $hobby)
  INSERT INTO hobbies SET `name` = FROM_BASE64("<?php echo
base64_encode($hobby); ?>")
end loop

* prepared statement pseudo code
prepare statement INSERT INTO hobbies SET `name` = ?
bind param $hobby
loop the $hobbies for ($i = 0; $i < count($hobbies); $i++)
   $hobby = $hobbies[i];
   execute statement
end loop

There would be a difference in performance since the the expression has to
be reevaluated, including the function FROM_BASE, every time versus one time
evaluation of prepared statement.



> As for the added complexity, if you have SQL statements all over your code
> then yes it will add a time overhead, but any codebase of a significant size
> should be using a centralised API for database access such that changes like
> this have a very limited scope.
>

Isn't that one of the major points of OOP?  Still, what about new
developers, having to remember that additional (and most likely unneeded)
complexity, to the project which they would like to build additional
modules/plugins for?


> -Stuart
>
> --
> Stuart Dallas
> 3ft9 Ltd
> http://3ft9.com/
>

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 9:25 PM, Tommy Pham <tommy...@gmail.com> wrote:
> There would be a difference in performance since the the expression has to
> be reevaluated, including the function FROM_BASE, every time versus one time
> evaluation of prepared statement.

This is true, but it should be pointed out that for a large majority
of web applications the performance hit given by either prepared
statements or base64 encoding is going to be miniscule compared to the
bottlenecks already present at the db-access and network-latency
layers. Sites that approach needing to actively worry about the
performance hit from either method are rare, and it's doubtful that
the solution used would be to remove the tactic, assuming the reasons
for the approach being used are sound and still present.

>
>
>
>> As for the added complexity, if you have SQL statements all over your code
>> then yes it will add a time overhead, but any codebase of a significant size
>> should be using a centralised API for database access such that changes like
>> this have a very limited scope.
>>
>
> Isn't that one of the major points of OOP?  Still, what about new
> developers, having to remember that additional (and most likely unneeded)
> complexity, to the project which they would like to build additional
> modules/plugins for?
>

The paragraph you're replying to is saying that this shouldn't be a
pain if your code is well organized. If your codebase is sane, these
details should be transparent to new developers. If they can't be,
then new developers get a chance to learn things :P

--- End Message ---
--- Begin Message ---
On Tue, Oct 4, 2011 at 8:01 PM, Jeremiah Dodds <jeremiah.do...@gmail.com>wrote:

> On Tue, Oct 4, 2011 at 9:25 PM, Tommy Pham <tommy...@gmail.com> wrote:
> > There would be a difference in performance since the the expression has
> to
> > be reevaluated, including the function FROM_BASE, every time versus one
> time
> > evaluation of prepared statement.
>
> This is true, but it should be pointed out that for a large majority
> of web applications the performance hit given by either prepared
> statements or base64 encoding is going to be miniscule compared to the
> bottlenecks already present at the db-access and network-latency
> layers. Sites that approach needing to actively worry about the
> performance hit from either method are rare, and it's doubtful that
> the solution used would be to remove the tactic, assuming the reasons
> for the approach being used are sound and still present.
>
> >
> >
> >
> >> As for the added complexity, if you have SQL statements all over your
> code
> >> then yes it will add a time overhead, but any codebase of a significant
> size
> >> should be using a centralised API for database access such that changes
> like
> >> this have a very limited scope.
> >>
> >
> > Isn't that one of the major points of OOP?  Still, what about new
> > developers, having to remember that additional (and most likely unneeded)
> > complexity, to the project which they would like to build additional
> > modules/plugins for?
> >
>
> The paragraph you're replying to is saying that this shouldn't be a
> pain if your code is well organized. If your codebase is sane, these
> details should be transparent to new developers. If they can't be,
> then new developers get a chance to learn things :P
>
>
My code base, being sane and KIS, wouldn't contain that base64 :P

Anyway, I've spent the last hour or so trying PoC and some metric analysis
but can't seem to get consistent results in execution speed in the MySQL
workbench testing without (restart server before each SQL statement to
prevent use of cache) and with FLUSH+RESET.  Would someone, when you have
time, please see if you're getting the same?  Here are the codes:

* PHP + HTML

<?php
if (isset($_POST['data']) && !empty($_POST['data'])) {
    $data = $_POST['data'];
    $data_base64 = base64_encode($data);
    echo $data.'<br/>'.$data_base64;
}
?>
<form method="post" enctype="multipart/form-data">
<input type="text" name="data" size="100" />
<input type="submit" value="Encode" />
</form>


* MySQL syntax + BASE64_DECODE [1]

DROP TABLE IF EXISTS test.base64;
CREATE TABLE IF NOT EXISTS test.base64 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  data_ varchar(100) NOT NULL COLLATE utf8_general_ci,
  data_base64 varchar(150) NOT NULL COLLATE utf8_general_ci
);

TRUNCATE test.base64;
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;
INSERT INTO test.base64 (data_, data_base64) VALUES ('string to encode
2',BASE64_DECODE('c3RyaW5nIHRvIGVuY29kZSAy')); /* 0.046 sec */
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;
PREPARE stmt1 FROM 'INSERT INTO test.base64 (data_, data_base64) VALUES (?,
?)';
 SET @a = 'string to encode 3';
 SET @b = 'string to encode 3';
EXECUTE stmt1 USING @a, @b;  /* 0.015 sec */
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;
INSERT INTO test.base64 (data_, data_base64) VALUES ('string to
encode','string to encode'); /* 0.015 sec */
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;
SELECT COUNT(*) FROM test.base64; /* 3 rows */
INSERT INTO test.base64 (data_, data_base64) VALUES ('string to
encode','string to encode'); DELETE FROM test.base64;');  /* error after
DELETE -> sample SQL injection */
SELECT COUNT(*) FROM test.base64; /* 0 rows */
INSERT INTO test.base64 (data_, data_base64) VALUES ('string to encode\');
DELETE FROM
test.base64;',BASE64_DECODE('c3RyaW5nIHRvIGVuY29kZScpOyBERUxFVEUgRlJPTSB0ZXN0LmJhc2U2NDs='));
SELECT COUNT(*) FROM test.base64; /* 1 row */


-- test SELECT queries against sample data table with 219,061 rows having 4
rows contains word 'pressure'
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;
SELECT * FROM test.base64_product_desc WHERE `name` LIKE CONCAT('%',
BASE64_DECODE('cHJlc3N1cmU='), '%');  /* 0.499 sec / 0.000 sec - subsequent
runs w/o FLUSH/RESET are about same time */
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;
SELECT * FROM test.base64_product_desc WHERE `name` LIKE '%pressure%';  /*
0.530 sec / 0.000 sec - subsequent runs are received from cached if not
reset */
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;
PREPARE stmt1 FROM 'SELECT * FROM test.base64_product_desc WHERE `name` LIKE
"%?%"'; /* "%?%"  \'%?%\'  '%?%'  */
 SET @a = 'pressure';
EXECUTE stmt1 USING @a; /* failed to run */
-- FLUSH TABLES; FLUSH PRIVILEGES; RESET QUERY CACHE;


Times recorded are from the initial run for both INSERT and SELECT.

[1] base64.sql attachment from http://bugs.mysql.com/bug.php?id=18861

--- End Message ---

Reply via email to