Re: Escaping placeholders

2014-12-21 Thread peter
On 20.12.2014 15:38, Tim Bunce wrote:
 [...]
 Can you, or anyone else, think of any situation where a backslash before
 a ? or :foo (or even $1) style placeholder might be valid SQL?

Inside quoted text: of course, yes.
Outside quoted text: maybe some RDBMS accept a backslash as a valid character
in the name of a table or column? (Haven't tried this out yet.)
At least make sure \? will not be handled as placeholder inside column name
quoting, as in e.g.
foo \? bar
or in
[foo \? bar]
with SQL Server, or
`foo \? bar`
for MySQL.
But my fear is that even
foo\?bar
would be accepted as valid column or table name by some RDBMS...

--  Peter Vanroose,
ABIS Training  Consulting
Leuven, Belgium.



Re: Escaping placeholders

2014-12-21 Thread Tim Bunce
On Sat, Dec 20, 2014 at 02:23:43PM -0800, Jonathan Leffler wrote:
 
  INSERT INTO SomeTable(DateTimeCol)
  VALUES(DATETIME[1](2014-12-31 23\:59\:59) YEAR TO SECOND);
 
I really won't want people have to futz with their legitimate Informix SQL 
 in order to pass it through
DBD::Informix.

Whatever is provided, whether by DBI or DBD::Informix, must accept the 
 code without the
backslashes in front of the colons.  It is simply not acceptable to have 
 to modify valid SQL to get it
past the gatekeeper code.

Understood. I wouldn't expect DBD::Informix to enable this by default.

We were only discussing a hypothetical situation where DBD::Informix
could optionally enable use of colon placeholders, if desired.

At the moment, the unescaped code works fine.  It will continue to work 
 fine.  As long as DBI does not
break the currently working code, I will survive �� like I have for the 
 last decade and more.  Just
make sure that whatever you do does not break working valid Informix SQL 
 code.

I have absolutely no intention of breaking anything :)

It'll be up to the individual driver authors to add support for escaping
placeholders, if they want to.

(The DBI has a built-in preparse function that's intended for parsing
placeholders but few, if any, drivers use it. I know DBD::Informix doesn't.)

Tim.

On Sat, Dec 20, 2014 at 2:17 PM, Tim Bunce [2]tim.bu...@pobox.com wrote:
 
  On Sat, Dec 20, 2014 at 01:14:29PM -0800, Jonathan Leffler wrote:
  Many, many years ago, DBD::Informix had to give up on the 
 DBI-provided parsing for placeholders
  because
  there were too many contexts in which it was wrong for Informix.  
 It may have improved since
  then, but:
  
INSERT INTO SomeTable(DateTimeCol)
  VALUES(DATETIME(2014-12-31 23:59:59) YEAR TO SECOND);
 
  I think I raised this as an issue back in the 1996-1998 timeframe 
 (I said 'many years ago' and
  meant
  it).  I'd have to dig through my release notes to be more precise.  
 Informix only supports
  natively the
  `?` placeholders.  It doesn't yet have the complexities introduced 
 by the PostgreSQL operators.
  
  I don't know whether this can be handled at all.  It may be that 
 DBD::Informix has to stay out in
  isolation but it would be nice if it wasn't necessary.
 
  The `?` placeholders are 'standard' (for some definition) so 
 DBD::Informix
  isn't really 'in isolation'. There are quite a few drivers that only
  support `?` placeholders.
 
  In theory, if this proposal goes ahead, and is applied to `:` 
 placeholders
  as seems likely, then you'd be able to write the above as:
  INSERT INTO SomeTable(DateTimeCol)
VALUES(DATETIME[3](2014-12-31 23\:59\:59) YEAR TO SECOND);
 
  Tim.
 
--
Jonathan Leffler [4]jonathan.leff...@gmail.com  #include disclaimer.h
Guardian of DBD::Informix - v2013.0521 - [5]http://dbi.perl.org
Blessed are we who can laugh at ourselves, for we shall never cease to be 
 amused.
 
 References
 
Visible links
1. file:///tmp/tel:%282014-12-31%2023
2. mailto:tim.bu...@pobox.com
3. file:///tmp/tel:%282014-12-31%2023
4. mailto:jonathan.leff...@gmail.com
5. http://dbi.perl.org/


Re: Escaping placeholders

2014-12-21 Thread Alexander Foken

On 20.12.2014 23:10, Tim Bunce wrote:

On Sat, Dec 20, 2014 at 05:35:55PM +0100, Alexander Foken wrote:

On 20.12.2014 15:38, Tim Bunce wrote:

Can you, or anyone else, think of any situation where a backslash before
a ? or :foo (or even $1) style placeholder might be valid SQL?

[...]

Do you plan to escape the escape character, i.e. use a double
backslash at DBI level to represent a single backslash at database
level?

That's a good question. I'm not sure. I think the answer has to be no.
I'd welcome any input on that.


The current problem with any kind of DBI placeholders is that there is 
no way to prevent DBI (including any DBDs) to interpret them as 
placeholders. This hasn't changed since 2000. The patches from 2000 
disable placeholders by type (?, $1, :foo) or entirely, but it is not 
possible to pass :foo unchanged and uninterpreted to the database while 
at the same time having a named placeholder :bar.


The proposed backslash escape allows exactly this, so you can have a 
named placeholder :bar and can pass :foo unchanged and uninterpreted to 
the database, but you have to add a backslash in front of :foo. No 
problems with old code for this usage, because this was simply not possible.


But with the proposed backslash escape, the backslash in SQL statements 
passed to the DBI will suddenly have a new meaning, were it is currently 
passed unmodified to the database. This will break old code.


Making the backslash escape special only directly in front of possible 
placeholders will probably break less code, because all other 
backslashes will behave as before. At the same time, the rules when a 
backslash is needed will become more complicated, and they may depend on 
the DBD, because different DBDs implement differrent placeholders (as 
far as I understand DBI).


Also, the rule that a backslash in front of a possible placeholder 
passes the placeholder, but not the backslash, unmodified to the 
database prevents passing a backslash followed by an actual placeholder 
to the database. This is possible with the current DBI code.


So, the backslash-is-only-special-in-front-of-placeholders rule breaks 
some existing code, makes it impossible to have a backslask in front of 
a placeholder, and the rules for when to use a backslash depend on the 
DBD implementation. This will likely lead to another change that will 
again changes the rules and break old code.


Can we do better? I think that we can:

As there is currently no escaping mechanism for placeholders in DBI, any 
new escaping mechanism will break some existing code. Breaking old code 
that uses DBI has happened before, e.g. in  DBI 1.20, 24th August 2001 
(selectall_hashref) DBI 1.19, 20th July 2001 (fetchall_arrayref), DBI 
0.91, 10th December 1997 (connect).


1. Use a clean escaping mechanism that won't need future changes that 
will again break old code:


Make all backslashes in SQL special. Require double backslashes for a 
single backslash at database level. Require escaping characters that 
otherwise might be parsed as placeholders (:, ?, $). Make unknown escape 
sequences fatal errors. This allows mixing placeholders and 
placeholder-like SQL sequences as before, it allows placing backslashes 
in front of placeholders, and it allows future extensions using the 
currently unknown escape sequences.


Existing code just has to double each and every backslash in SQL code.

2. Slowly introduce the new mechanism:

First, the backslash-escape must explicitly be enabled using an 
attribute BackslashEscape during connect(). Default is off, i.e. no 
backslash escaping. No old code should have problems. New code can 
enable backslash-escapes when needed by using connect() with 
BackslashEscape=on.


After a while, change the default to warn, so that backslash escaping 
stays off, but any backslash in SQL statements will issue a 
compatibility warning. Old code will still run unmodified, but will 
likely issue some warnings. Those warnings can be disabled by adding 
BackslashEscape=off to connect().


After some more time, make changing BackslashEscape to off cause a 
compatibility warning. Old unmaintained code will still run unmodified, 
but issue some warnings. Maintained old code that sets BackslashEscape 
to off will issue one warning per connect.


After even more time, change the default to on, so backslash escaping is 
enabled. This will intentionally break old unmaintained code that was 
not modified to switch off BackslashEscape. Old maintained code will 
continue to run, but issue one warning per connect.


Finally, remove the non-backslash-escape code and make swiching to that 
code (off or warn) a fatal error. This will intentionally break old code 
that had a one-time maintainance but was not changed to use backslashes 
properly.


The above is the user view of DBI.

DBI developer view:

As far as I understand the DBI, DBDs may also parse SQL code. They need 
to be changed to respect the BackslashEscape attribute. Using old 

Re: Escaping placeholders

2014-12-21 Thread Tim Bunce
On Sun, Dec 21, 2014 at 10:27:18AM +0100, pe...@vanroose.be wrote:
 On 20.12.2014 15:38, Tim Bunce wrote:
  [...]
  Can you, or anyone else, think of any situation where a backslash before
  a ? or :foo (or even $1) style placeholder might be valid SQL?
 
 Inside quoted text: of course, yes.
 Outside quoted text: maybe some RDBMS accept a backslash as a valid character
 in the name of a table or column? (Haven't tried this out yet.)
 At least make sure \? will not be handled as placeholder inside column name
 quoting, as in e.g.
 foo \? bar

That's standard identifier quoting so should be ignored by the driver
for the same reason that single quoted strings are.

 or in
 [foo \? bar]
 with SQL Server, or
 `foo \? bar`
 for MySQL.

For databases that support those non-standard identifier quoting styles
the driver should be treating them as strings and so skipping them anyway.

 But my fear is that even
 foo\?bar
 would be accepted as valid column or table name by some RDBMS...

I find that very hard to believe without any evidence.
Even if true, the driver for such a database would, I presume,
treat that ? as a placeholder and so it would already be broken.

Tim.


Re: Escaping placeholders

2014-12-21 Thread Greg Sabino Mullane (the tenth man)

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Tim Bunce wrote:

 For code not using DBIx::Class the pg_placeholder_dollaronly attribute
 might work, see https://metacpan.org/pod/DBD::Pg#Placeholders

Yes, this is the current canonical solution. It's a real shame 
that ? was used as an operator, but that horse has left the barn.

 For code using DBIx::Class the problem is more tricky. I'm pretty sure
 that SQL::Abstract and thus DBIx::Class only support question mark
 placeholders. That means it probably impossible to use expressions
 containing a question mark operator with SQL::Abstract/DBIx::Class.
 (Though I'd be delighted to be proven wrong.)
 
 So I think the DBI spec for placeholders needs to be extended to allow a
 way to 'escape' a question mark that the driver would otherwise treat as
 a placeholder.

 The obvious and natural approach would be to use a backslash before a
 question mark. The backslash would be removed by the driver before the
 statement is passed to the backend.

I'm going to play devil's advocate a bit here. There are some problems with 
this approach. First, it will require that the user know if the underlying 
DBD supports backslashes. Which likely means that SQL::Abstract and/or 
DBIx::Class will need to know as well. (Unless they expose the DBD directly 
to the user, which ruins the point a bit). Since we will thus need to patch 
those other modules, so why not fix them to do the right thing? (see below 
for a counter proposal).

Another problem is that we have now extended the SQL syntax for our own 
purposes. While one could argue that placeholders already do so, their 
current use is consistent, widespread (e.g. not just DBI), and in part 
used by the underlying RDBMSs themselves (e.g. Postgres uses dollar-number 
placeholders). So we will have both escaped and unescaped versions of SQL 
floating around, subject to the whims of whether or not your particular 
DBD supports it (and in which version). All of which seems like an awful 
lot of work to fix SQL::Abstract. Why bother patching every DBD in 
existence when we can simply patch SQL::Abstract?

Which leads to my counter-proposal: have SQL::Abstract accept dollar-number 
placeholders. It can pass pg_placeholder_dollaronly down the stack as 
needed. This neatly puts the onus back onto the frameworks, rather than 
having the DBDs selectively remove backslashes before passing to the 
RDBMS (ick). DBIx::Class and friends could even map dollar signs back to 
a format supported by the underlying DBDs, if they don't support dollar 
signs (that is one of their grand purposes after all - abstracting out 
details and taking care of things in the background).

- -- 
Greg Sabino Mullane g...@turnstep.com
PGP Key: 0x14964AC8 201412211008
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAlSW5E8ACgkQvJuQZxSWSshusgCfXOIjbJxQAF9s7FJEncDFoBGG
oHgAnjMs9kP/imrZTnknJpUIuXOhmoPL
=FxZ4
-END PGP SIGNATURE-




Re: Escaping placeholders

2014-12-21 Thread Darren Duncan

I agree with Greg's counter-proposal, from which I derive my own words here.

1.  I propose that there be no modification to the DBI spec related to new 
escaping whether optional or not, so leave things the way they are here, SQL 
continues to be interpreted the way it long has by default.


2.  When users want to use operators in PostgreSQL that contain literal ? in 
their names, then they enable DBD::Pg's pg_placeholder_dollaronly so that ? are 
no longer treated as placeholders.  Likewise, pg_placeholder_nocolons can be 
enabled when they don't want literal : to indicate a placeholder either.  Users 
would either do this directly if they're using DBI/DBD::Pg directly, or 
indirectly using their over-top framework of choice.  When users aren't using 
the ? operators et al, they can leave things alone which will work as normal.


3.  SQL::Abstract et al, those tools external to DBI/DBDs, are the correct and 
proper places to modify where users of said want to use the operators with ? 
names and such.  These tools already have special knowledge of individual DBMS 
systems to work with them effectively, and the ? operators is just one more of 
those things.  The users of said tools may have to flip a configuration switch 
possibly so $1 etc are used behind the scenes, if necessary, but that's okay 
because the use of ? operators only happens when the users choose to make a 
change to use them anyway.


In summary, now is not the time or place to be introducing backslashing doubled 
or otherwise in DBI such as discussed, that's a poor solution and its better to 
save such risky/etc changes for when there's a more compelling case for them.


I should also mention I feel it is perfectly reasonable for each DBMS to have 
operators composed of any characters they want where doing so makes sense within 
the context of the SQL/etc itself.  See also that Perl itself has both ? and : 
and etc as operator names, Perl 6 even more so, and I don't see anyone thinking 
that's a bad idea.  So I have no problem with PostgreSQL having ? in operator 
names such as it did.  Its not like the SQL standard reserves ? or whatever for 
prepared statement parameters, that's defined to be implementation dependent I 
believe (WD 9075-2:200w(E) 20.6 prepare statement).


-- Darren Duncan

On 2014-12-21 7:17 AM, Greg Sabino Mullane (the tenth man) wrote:

Tim Bunce wrote:


For code not using DBIx::Class the pg_placeholder_dollaronly attribute
might work, see https://metacpan.org/pod/DBD::Pg#Placeholders


Yes, this is the current canonical solution. It's a real shame
that ? was used as an operator, but that horse has left the barn.


For code using DBIx::Class the problem is more tricky. I'm pretty sure
that SQL::Abstract and thus DBIx::Class only support question mark
placeholders. That means it probably impossible to use expressions
containing a question mark operator with SQL::Abstract/DBIx::Class.
(Though I'd be delighted to be proven wrong.)

So I think the DBI spec for placeholders needs to be extended to allow a
way to 'escape' a question mark that the driver would otherwise treat as
a placeholder.

The obvious and natural approach would be to use a backslash before a
question mark. The backslash would be removed by the driver before the
statement is passed to the backend.


I'm going to play devil's advocate a bit here. There are some problems with
this approach. First, it will require that the user know if the underlying
DBD supports backslashes. Which likely means that SQL::Abstract and/or
DBIx::Class will need to know as well. (Unless they expose the DBD directly
to the user, which ruins the point a bit). Since we will thus need to patch
those other modules, so why not fix them to do the right thing? (see below
for a counter proposal).

Another problem is that we have now extended the SQL syntax for our own
purposes. While one could argue that placeholders already do so, their
current use is consistent, widespread (e.g. not just DBI), and in part
used by the underlying RDBMSs themselves (e.g. Postgres uses dollar-number
placeholders). So we will have both escaped and unescaped versions of SQL
floating around, subject to the whims of whether or not your particular
DBD supports it (and in which version). All of which seems like an awful
lot of work to fix SQL::Abstract. Why bother patching every DBD in
existence when we can simply patch SQL::Abstract?

Which leads to my counter-proposal: have SQL::Abstract accept dollar-number
placeholders. It can pass pg_placeholder_dollaronly down the stack as
needed. This neatly puts the onus back onto the frameworks, rather than
having the DBDs selectively remove backslashes before passing to the
RDBMS (ick). DBIx::Class and friends could even map dollar signs back to
a format supported by the underlying DBDs, if they don't support dollar
signs (that is one of their grand purposes after all - abstracting out
details and taking care of things in the background).