Re: Escaping placeholders, take 2
On Fri, Jan 09, 2015 at 02:54:56PM -, Greg Sabino Mullane wrote: > > Just a heads up to this list - support for backslash-escaped placeholders > is now implemented in the current production version of DBD::Pg, 3.5.0 > > http://search.cpan.org/dist/DBD-Pg/ > > http://cpansearch.perl.org/src/TURNSTEP/DBD-Pg-3.5.0/Changes Many thanks Greg. Tim.
Re: Escaping placeholders, take 2
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 Just a heads up to this list - support for backslash-escaped placeholders is now implemented in the current production version of DBD::Pg, 3.5.0 http://search.cpan.org/dist/DBD-Pg/ http://cpansearch.perl.org/src/TURNSTEP/DBD-Pg-3.5.0/Changes - -- Greg Sabino Mullane g...@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201501090952 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAlSv64MACgkQvJuQZxSWSsjPNQCfbt80/2VAaRj5qIQG7fm+0TqL kSQAmQHrbf1d22hWx5+NuOeq7TmPccmK =+XvD -END PGP SIGNATURE-
Re: Escaping placeholders, take 2
On Wed, Dec 31, 2014 at 03:55:42AM -, Greg Sabino Mullane wrote: > > Tim Bunce wrote: > > Wherein I attempt to summarize the requirements, the background, the > > options, > > the risks, and the counter-proposal ... > > Thanks for taking the time to write all this out in a coherent manner. You're welcome. I find the effort usually greatly clarifies my thinking. > I may once again assume the role of devil's advocate herein. Thanks. > > In modern application development, however, the DBI is just another > > foundation layer, low down in the stack. Above it are modules like > > SQL::Abstract, and ORMs like DBIx::Class. Companies have large and > > growing investments in these stacks, plus the modules they've built over > > them to provide abstraction and encapsulation of business logic. > > Agreed. > > > For these large applications the attribute workarounds are rarely helpful. > > Is that a true generalization? Or is it only because these large > applications have not bothered to support the attributes? Supporting attributes means supporting (non-standard) numbered placeholders. Supporting numbered placeholders is, as previously described, very far from trivial for SQL::Abstract and thus DBIx::Class. (I certainly wouldn't characterize this as "not bothered".) > > +Some drivers allow you to prevent the recognition of a placeholder by > > placing a > > +single backslash character (C<\>) immediately before it. The driver will > > remove > > +the backslash character and ignore the placeholder, passing it unchanged > > to the > > +backend. If the driver supports this then L(9000) will return > > true. > > Seems a good start. I don't like the phrase "prevent recognition of a > placeholder" > but it's too late at night to propose an alternate right now. s/prevent/disable/? > >> Which likely means that SQL::Abstract and/or DBIx::Class will need to > >> know as well. (Unless they expose the DBD directly to the user, which > >> ruins the point a bit). > > > > Not true. > > How will a user of SQL::Abstract know if escapes are supported or not? > Isn't the job of such frameworks to release the user from such worries, > or at least present some sort of attribute the user can rely on? SQL::Abstract isn't (just) about portability. It allows SQL conditions to be expressed and composed via Perl data structures. That's very powerful functionality. Users who want to use a postgres ? operator know they're using DBD::Pg and can simply require the appropriate version. > > Question marks are the international standard for placeholders. > > The DBI should have specified a way to escape them from the start. > > I'm attempting to fix that now. > > Taking off the devil's advocate hat now. I can agree with this > position and the backslash solution. Great. Thanks Greg. > > To demonstrate the relative complexity, here's a proof-of-concept patch > > for DBD::Pg to implement support for escaping question mark placeholders: > > > > https://github.com/timbunce/dbdpg/commit/54358c7a7efeeaf2666c5e28c301e47624fb9615 > > +1, thanks for that. I certainly cannot see any harm in adding escape support > to DBD::Pg, and then letting the upstream modules start adapting to it. Wonderful. I'll work on polishing up the patch. Tim.
Re: Escaping placeholders, take 2
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 Tim Bunce wrote: > Wherein I attempt to summarize the requirements, the background, the options, > the risks, and the counter-proposal ... Thanks for taking the time to write all this out in a coherent manner. I may once again assume the role of devil's advocate herein. > In modern application development, however, the DBI is just another > foundation layer, low down in the stack. Above it are modules like > SQL::Abstract, and ORMs like DBIx::Class. Companies have large and > growing investments in these stacks, plus the modules they've built over > them to provide abstraction and encapsulation of business logic. Agreed. > For these large applications the attribute workarounds are rarely helpful. Is that a true generalization? Or is it only because these large applications have not bothered to support the attributes? > Working 'high up' in an application stack that's dynamically constructing > SQL with placeholders it should be possible to express any desired behavior. Agreed. > Some mechanism is needed to allow an individual character sequence that > normally represents a placeholder, to appear in the SQL without it being > treated as a placeholder by the driver. To "escape" it's usual meaning. But... okay I can't find any fault with this. :) > If escaping question mark placeholders was already defined in 2008 then > pg_placeholder_dollaronly might not have been added to DBD::Pg. Perhaps, perhaps not. pg_placeholder_dollaronly still allows the same SQL to be slung around, including using it inside of Postgres itself. So it's likely both methods would have been available. > 1. Backslash Seems like a good solution. A little ugly, but certainly unlikely to clash with any future SQL-isms. > but that seems very minor as the use of backslashes in single and double > quotes is natural and familar for Perl developers. True. We could recommend always using double quotes - and Perl::Critic's inevitable nagging be damned. :) > 2. Doubling Ugh. No thanks. > I presume there are no database dialects where two adjacent placeholders > would be valid, so I think it's safe. Nope, '??' is a perfectly legal operator in Postgres. A backslash, however, cannot be part of an operator's name. (nor can braces). > 3. Vendor escape clauses Interesting. Probably overkill at this point, but certainly something we could think about supporting. > $sql_abstract->where({ json_field => { '{verbatim ?}' => $bar } }) I find it hard to imagine people getting excited about writing that, and I would imagine SQL::Abstract would simply let people write { "\\?" => $bar } and write the long-form behind their back. > No code will break! Doubling will. But nobody likes that option anyway. > +Some drivers allow you to prevent the recognition of a placeholder by > placing a > +single backslash character (C<\>) immediately before it. The driver will > remove > +the backslash character and ignore the placeholder, passing it unchanged to > the > +backend. If the driver supports this then L(9000) will return > true. Seems a good start. I don't like the phrase "prevent recognition of a placeholder" but it's too late at night to propose an alternate right now. > (I'm undecided about making the get_info(9000) return value a bit-map Ugh, no. Drivers should escape all placeholder types it supports. >> Which likely means that SQL::Abstract and/or DBIx::Class will need to >> know as well. (Unless they expose the DBD directly to the user, which >> ruins the point a bit). > > Not true. How will a user of SQL::Abstract know if escapes are supported or not? Isn't the job of such frameworks to release the user from such worries, or at least present some sort of attribute the user can rely on? >> So we will have both escaped and unescaped versions of SQL >> floating around, subject to the whims of whether or not your particular >> DBD supports it (and in which version). > > There's no such thing as "escaped and unescaped versions of SQL". > That's a false premise. This is a really important point: the syntax > being proposed *is invalid SQL*. I'm not sure of the distinction you are making here. The proposed syntax is invalid, true, and the "original" syntax may or may not be invalid, depending on how you look at it. Portable code that won't know in advance if the underlying dbd supports escaping will need to have two SQLs paths - one with escaping, and one to do it some other way. > Then ribasushi, a principle maintainer of SQL::Abstract and DBIx::Class, > pointed out the fatal flaw: numbered placeholders aren't directly composable. ... > Trying to do the same with numbered placeholders (colon or dollar) would > require parsing and rewriting of the SQL. Clearly non-trivial and slow. > For this reason he wouldn't accept such a patch to SQL::Abstract, and I'd > agree with that position. Non-trivial, but not exactly difficult either, especially considering
Re: Escaping placeholders, take 2
Hi All My apologies. Instead of Text::Balanced::Marpa, I should have pointed you to MarpaX::Languages::SQL2003::AST. Here, AST is Abstract syntax tree. See https://en.wikipedia.org/wiki/Abstract_syntax_tree. -- Ron Savage - savage.net.au
Re: Escaping placeholders, take 2
On Dec 28, 2014, at 5:49 AM, Tim Bunce wrote: > Question marks are the international standard for placeholders. > The DBI should have specified a way to escape them from the start. > I'm attempting to fix that now. +1 David smime.p7s Description: S/MIME cryptographic signature
Re: Escaping placeholders, take 2
Hi Darren On 29/12/14 13:59, Darren Duncan wrote: The API could involve an 'attr' given when preparing a SQL statement or other appropriate places. placeholder_escape_delimiters => [ '\{','}' ] placeholder_escape_delimiters => [ '\{','}','\[',']' ] placeholder_escape_delimiters => [ '\{{{','}}}' ] placeholder_escape_delimiters => [ '{{{','}}}' ] In this way, the backslash is no longer special, or necessary, though I anticipate it would often still be used for the mnemonics. Rather, when the driver is parsing the SQL for placeholders, if it encounters any left delimiter strings, if will leave the following SQL unaltered until it encounters the corresponding right delimiter string, and then it looks for placeholders again. Perhaps a job for https://metacpan.org/release/Text-Balanced-Marpa -- Ron Savage - savage.net.au
Re: Escaping placeholders, take 2
Thank you for this post Tim, it seemed to lay out the issues well and make a lot of things clear. I'm now inclined to support your backslash-escape proposal at the DBI driver level in principle. (I also agree that the doubling method is nasty.) Or alternately I suggest variation on that proposal that brings in shades of the vendor escape clauses. I suggest a variant where instead of a single backslash character indicating an escape, we have some multi-character thing to indicate it, ideally involving delimiters so it is more clear on how far the effect of the new escape feature is supposed to go. For example, using \{?} rather than \? or \{:foo} rather than \:foo. One benefit of that is if you have some SQL that contains ? or : numerous times, you only need to surround the whole fragment with \{} and not individually escape each occurrence, making for neater code. As to dealing with say literal { or } in the SQL as I could see reasonably happening, the quoting mechanism could be made more generic like Perl's "q"/etc, so for example it would take the form \X...X where whatever character appears after the \ is what is matched, and it could be a multiplicity if necessary within reason, eg \{{{...}}} would just work. For that matter, heredocs or the "quoted printable" feature you can see in email messages or such, eg you have \"foo"...foo or some such. I'm speaking in principle here, I'm not proposing a specific feature set, but both "q" as well as Perl 6's related quoting mechanism is useful for guidance, and I think something involving delimiters is best. But if some of that sounds unduly complicated, I have a better idea. I propose that the DBI include an API for users to tell the driver what possible escape delimiters they are using. For example, doable at least at a statement level (and optionally on a connection/etc level for defaulting if that makes sense). The API could involve an 'attr' given when preparing a SQL statement or other appropriate places. placeholder_escape_delimiters => [ '\{','}' ] placeholder_escape_delimiters => [ '\{','}','\[',']' ] placeholder_escape_delimiters => [ '\{{{','}}}' ] placeholder_escape_delimiters => [ '{{{','}}}' ] In this way, the backslash is no longer special, or necessary, though I anticipate it would often still be used for the mnemonics. Rather, when the driver is parsing the SQL for placeholders, if it encounters any left delimiter strings, if will leave the following SQL unaltered until it encounters the corresponding right delimiter string, and then it looks for placeholders again. (As to numbered placeholders, which are effectively a special case of named placeholders, not being directly composable in SQL::Abstract, I see that as being a problem itself. It would be a great help to developers in principle if the native way for working with parameters was named rather than positionally. However, that is really a separate matter to deal with and I think it is a good idea for Tim's proposal in some form to happen regardless of dealing with this separate matter.) -- Darren Duncan
Escaping placeholders, take 2
Wherein I attempt to summarize the requirements, the background, the options, the risks, and the counter-proposal ... ** Background Drivers that support placeholders parse the SQL statement to find them. The parsing is really just simple tokenization that only knows how to skip over quoted strings, quoted identifiers, and comments for the SQL dialect used by the database. The remaining text is checked for placeholders. A question mark is the international SQL standard for placeholders. Per ISO/IEC 9075-2:2003 section 4.24 Dynamic SQL concepts: In SQL-statements that are executed dynamically, the parameters are called dynamic parameters (s) and are represented in SQL language by a (?). Some drivers also support placeholders expressed as a colon ':' followed by a number and/or a name. Some drivers support other forms like a dollar '$' followed by a number. Different types of placeholders cannot be mixed within a statement. The number of parameters passed to execute (etc.) has to match the number of placeholders found in the statement or an error is reported. An individual SQL statement typically comes primarily from one of these sources: 1. Written literally in the source code, perhaps with some $interpolation 2. Constructed dynamically, in whole or part, using a module like SQL::Abstract 3. From an external source, like an SQL script file or 'catalog'. ** Historical Issues and Work-arounds Sometimes the driver will identify placeholders in the SQL statement that weren't intended by the developer. So some drivers have added ad-hoc ways to limit the parsing of placeholders. For example: pg_direct - query passed directly without parsing for placeholders at all pg_placeholder_dollaronly - question marks are not treated as placeholders pg_placeholder_nocolons - colons are not treated as placeholders ora_placeholders - disable processing of all placeholders odbc_ignore_named_placeholders - disable processing of :foo placeholders In most cases these have been added as a workaround for statements that contain colons followed by words that the driver was treating as placeholders. (For example :old and :new when defining Oracle triggers.) The exception is pg_placeholder_dollaronly which was added in 2008 to allow use of postgres geometric operators like "?||" ("are parallel") See http://www.postgresql.org/docs/9.4/static/functions-geometry.html So supporting multiple styles of placeholders has both caused problems and allowed drivers to "dodge the bullet" when one style has become a problem in a particular situation. But that's only a limited solution. It doesn't promote the development and use of generic higher-level reusable modules. (You could frame the progress of software development in terms of enabling developers to work at ever-higher levels of abstraction.) ** Limitation of Existing Workarounds A key problem with these attribute workarounds is that they affect the entire statement. That's a good approach when loading statements from an external source, and often ok when writing individual DBI calls directly where an alternative placeholder style or quote()'ing can be used. In modern application development, however, the DBI is just another foundation layer, low down in the stack. Above it are modules like SQL::Abstract, and ORMs like DBIx::Class. Companies have large and growing investments in these stacks, plus the modules they've built over them to provide abstraction and encapsulation of business logic. For these large applications the attribute workarounds are rarely helpful. Working 'high up' in an application stack that's dynamically constructing SQL with placeholders it should be possible to express any desired behavior. ** Requirements Some mechanism is needed to allow an individual character sequence that normally represents a placeholder, to appear in the SQL without it being treated as a placeholder by the driver. To "escape" it's usual meaning. If escaping question mark placeholders was already defined in 2008 then pg_placeholder_dollaronly might not have been added to DBD::Pg. The issue of escaping placeholders is on the table now because PostgreSQL supports a JSON type with a rich set of features and operators, some of which use a question mark. For example, the expression json_type_field ? 'bar' is true if the string 'bar' exists as a key/element exist within the JSON value of json_type_field. For more information see http://www.postgresql.org/docs/9.4/static/functions-json.html Use of JSON and these operators is likely to be much more common than use of the geometric operators mentioned earlier. So, while the needs of Postgres users is what prompted this topic now, the requirement is more general. It would be good to agree on an approach that could be applied to most drivers if the need arises. ** Options There seem to be several options worth considering 1. Backslash A backslash could be placed immediately before a question mark t