Re: Escaping placeholders, take 2

2015-01-09 Thread Tim Bunce
On Fri, Jan 09, 2015 at 02:54:56PM -, Greg Sabino Mullane wrote:
> 
> Just a heads up to this list - support for backslash-escaped placeholders 
> is now implemented in the current production version of DBD::Pg, 3.5.0
> 
> http://search.cpan.org/dist/DBD-Pg/
> 
> http://cpansearch.perl.org/src/TURNSTEP/DBD-Pg-3.5.0/Changes

Many thanks Greg.

Tim.


Re: Escaping placeholders, take 2

2015-01-09 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Just a heads up to this list - support for backslash-escaped placeholders 
is now implemented in the current production version of DBD::Pg, 3.5.0

http://search.cpan.org/dist/DBD-Pg/

http://cpansearch.perl.org/src/TURNSTEP/DBD-Pg-3.5.0/Changes

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201501090952
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAlSv64MACgkQvJuQZxSWSsjPNQCfbt80/2VAaRj5qIQG7fm+0TqL
kSQAmQHrbf1d22hWx5+NuOeq7TmPccmK
=+XvD
-END PGP SIGNATURE-




Re: Escaping placeholders, take 2

2015-01-02 Thread Tim Bunce
On Wed, Dec 31, 2014 at 03:55:42AM -, Greg Sabino Mullane wrote:
> 
> Tim Bunce wrote:
> > Wherein I attempt to summarize the requirements, the background, the 
> > options,
> > the risks, and the counter-proposal ...
> 
> Thanks for taking the time to write all this out in a coherent manner.

You're welcome. I find the effort usually greatly clarifies my thinking.

> I may once again assume the role of devil's advocate herein.

Thanks.

> > In modern application development, however, the DBI is just another
> > foundation layer, low down in the stack. Above it are modules like
> > SQL::Abstract, and ORMs like DBIx::Class. Companies have large and
> > growing investments in these stacks, plus the modules they've built over
> > them to provide abstraction and encapsulation of business logic.
> 
> Agreed.
> 
> > For these large applications the attribute workarounds are rarely helpful.
> 
> Is that a true generalization? Or is it only because these large 
> applications have not bothered to support the attributes?

Supporting attributes means supporting (non-standard) numbered placeholders.
Supporting numbered placeholders is, as previously described, very far
from trivial for SQL::Abstract and thus DBIx::Class.  (I certainly
wouldn't characterize this as "not bothered".)


> > +Some drivers allow you to prevent the recognition of a placeholder by 
> > placing a
> > +single backslash character (C<\>) immediately before it. The driver will 
> > remove
> > +the backslash character and ignore the placeholder, passing it unchanged 
> > to the
> > +backend. If the driver supports this then L(9000) will return 
> > true.
> 
> Seems a good start. I don't like the phrase "prevent recognition of a 
> placeholder" 
> but it's too late at night to propose an alternate right now.

s/prevent/disable/?


> >> Which likely means that SQL::Abstract and/or DBIx::Class will need to
> >> know as well. (Unless they expose the DBD directly to the user, which
> >> ruins the point a bit).
> >
> > Not true.
> 
> How will a user of SQL::Abstract know if escapes are supported or not? 
> Isn't the job of such frameworks to release the user from such worries, 
> or at least present some sort of attribute the user can rely on?

SQL::Abstract isn't (just) about portability. It allows SQL conditions
to be expressed and composed via Perl data structures. That's very
powerful functionality. Users who want to use a postgres ? operator know
they're using DBD::Pg and can simply require the appropriate version.


> > Question marks are the international standard for placeholders.
> > The DBI should have specified a way to escape them from the start.
> > I'm attempting to fix that now.
> 
> Taking off the devil's advocate hat now. I can agree with this 
> position and the backslash solution.

Great. Thanks Greg.

> > To demonstrate the relative complexity, here's a proof-of-concept patch
> > for DBD::Pg to implement support for escaping question mark placeholders:
> >
> > https://github.com/timbunce/dbdpg/commit/54358c7a7efeeaf2666c5e28c301e47624fb9615
> 
> +1, thanks for that. I certainly cannot see any harm in adding escape support 
> to DBD::Pg, and then letting the upstream modules start adapting to it.

Wonderful. I'll work on polishing up the patch.

Tim.


Re: Escaping placeholders, take 2

2014-12-30 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Tim Bunce wrote:
> Wherein I attempt to summarize the requirements, the background, the options,
> the risks, and the counter-proposal ...

Thanks for taking the time to write all this out in a coherent manner.
I may once again assume the role of devil's advocate herein.

> In modern application development, however, the DBI is just another
> foundation layer, low down in the stack. Above it are modules like
> SQL::Abstract, and ORMs like DBIx::Class. Companies have large and
> growing investments in these stacks, plus the modules they've built over
> them to provide abstraction and encapsulation of business logic.

Agreed.

> For these large applications the attribute workarounds are rarely helpful.

Is that a true generalization? Or is it only because these large 
applications have not bothered to support the attributes?

> Working 'high up' in an application stack that's dynamically constructing
> SQL with placeholders it should be possible to express any desired behavior.

Agreed.

> Some mechanism is needed to allow an individual character sequence that
> normally represents a placeholder, to appear in the SQL without it being
> treated as a placeholder by the driver. To "escape" it's usual meaning.

But...  okay I can't find any fault with this. :)

> If escaping question mark placeholders was already defined in 2008 then
> pg_placeholder_dollaronly might not have been added to DBD::Pg.

Perhaps, perhaps not. pg_placeholder_dollaronly still allows the same SQL to 
be slung around, including using it inside of Postgres itself. So it's 
likely both methods would have been available.

> 1. Backslash

Seems like a good solution. A little ugly, but certainly unlikely to 
clash with any future SQL-isms.

> but that seems very minor as the use of backslashes in single and double
> quotes is natural and familar for Perl developers.

True. We could recommend always using double quotes - and Perl::Critic's 
inevitable nagging be damned. :)

> 2. Doubling

Ugh. No thanks.

> I presume there are no database dialects where two adjacent placeholders 
> would be valid, so I think it's safe.

Nope, '??' is a perfectly legal operator in Postgres. A backslash, however, 
cannot be part of an operator's name. (nor can braces).

> 3. Vendor escape clauses

Interesting. Probably overkill at this point, but certainly something 
we could think about supporting.

> $sql_abstract->where({ json_field => { '{verbatim ?}' => $bar } })

I find it hard to imagine people getting excited about writing 
that, and I would imagine SQL::Abstract would simply let people write 
{ "\\?" => $bar } and write the long-form behind their back.

> No code will break!

Doubling will. But nobody likes that option anyway.

> +Some drivers allow you to prevent the recognition of a placeholder by 
> placing a
> +single backslash character (C<\>) immediately before it. The driver will 
> remove
> +the backslash character and ignore the placeholder, passing it unchanged to 
> the
> +backend. If the driver supports this then L(9000) will return 
> true.

Seems a good start. I don't like the phrase "prevent recognition of a 
placeholder" 
but it's too late at night to propose an alternate right now.

> (I'm undecided about making the get_info(9000) return value a bit-map

Ugh, no. Drivers should escape all placeholder types it supports.

>> Which likely means that SQL::Abstract and/or DBIx::Class will need to
>> know as well. (Unless they expose the DBD directly to the user, which
>> ruins the point a bit).
>
> Not true.

How will a user of SQL::Abstract know if escapes are supported or not? 
Isn't the job of such frameworks to release the user from such worries, 
or at least present some sort of attribute the user can rely on?

>> So we will have both escaped and unescaped versions of SQL 
>> floating around, subject to the whims of whether or not your particular 
>> DBD supports it (and in which version).
>
> There's no such thing as "escaped and unescaped versions of SQL".
> That's a false premise.  This is a really important point: the syntax
> being proposed *is invalid SQL*.

I'm not sure of the distinction you are making here. The proposed 
syntax is invalid, true, and the "original" syntax may or may not be 
invalid, depending on how you look at it. Portable code that won't 
know in advance if the underlying dbd supports escaping will need 
to have two SQLs paths - one with escaping, and one to do it some 
other way.

> Then ribasushi, a principle maintainer of SQL::Abstract and DBIx::Class,
> pointed out the fatal flaw: numbered placeholders aren't directly composable.
...
> Trying to do the same with numbered placeholders (colon or dollar) would
> require parsing and rewriting of the SQL.  Clearly non-trivial and slow.
> For this reason he wouldn't accept such a patch to SQL::Abstract, and I'd
> agree with that position.

Non-trivial, but not exactly difficult either, especially considering 

Re: Escaping placeholders, take 2

2014-12-30 Thread Ron Savage

Hi All

My apologies. Instead of Text::Balanced::Marpa, I should have pointed 
you to MarpaX::Languages::SQL2003::AST.


Here, AST is Abstract syntax tree. See 
https://en.wikipedia.org/wiki/Abstract_syntax_tree.


--
Ron Savage - savage.net.au


Re: Escaping placeholders, take 2

2014-12-29 Thread David E. Wheeler
On Dec 28, 2014, at 5:49 AM, Tim Bunce  wrote:

> Question marks are the international standard for placeholders.
> The DBI should have specified a way to escape them from the start.
> I'm attempting to fix that now.

+1

David



smime.p7s
Description: S/MIME cryptographic signature


Re: Escaping placeholders, take 2

2014-12-28 Thread Ron Savage

Hi Darren

On 29/12/14 13:59, Darren Duncan wrote:


The API could involve an 'attr' given when preparing a SQL statement or
other appropriate places.

   placeholder_escape_delimiters => [ '\{','}' ]
   placeholder_escape_delimiters => [ '\{','}','\[',']' ]
   placeholder_escape_delimiters => [ '\{{{','}}}' ]
   placeholder_escape_delimiters => [ '{{{','}}}' ]

In this way, the backslash is no longer special, or necessary, though I
anticipate it would often still be used for the mnemonics.

Rather, when the driver is parsing the SQL for placeholders, if it
encounters any left delimiter strings, if will leave the following SQL
unaltered until it encounters the corresponding right delimiter string,
and then it looks for placeholders again.


Perhaps a job for https://metacpan.org/release/Text-Balanced-Marpa

--
Ron Savage - savage.net.au


Re: Escaping placeholders, take 2

2014-12-28 Thread Darren Duncan
Thank you for this post Tim, it seemed to lay out the issues well and make a lot 
of things clear.


I'm now inclined to support your backslash-escape proposal at the DBI driver 
level in principle.


(I also agree that the doubling method is nasty.)

Or alternately I suggest variation on that proposal that brings in shades of the 
vendor escape clauses.


I suggest a variant where instead of a single backslash character indicating an 
escape, we have some multi-character thing to indicate it, ideally involving 
delimiters so it is more clear on how far the effect of the new escape feature 
is supposed to go.


For example, using \{?} rather than \? or \{:foo} rather than \:foo.

One benefit of that is if you have some SQL that contains ? or : numerous times, 
you only need to surround the whole fragment with \{} and not individually 
escape each occurrence, making for neater code.


As to dealing with say literal { or } in the SQL as I could see reasonably 
happening, the quoting mechanism could be made more generic like Perl's "q"/etc, 
so for example it would take the form \X...X where whatever character appears 
after the \ is what is matched, and it could be a multiplicity if necessary 
within reason, eg \{{{...}}} would just work.  For that matter, heredocs or the 
"quoted printable" feature you can see in email messages or such, eg you have 
\"foo"...foo or some such.


I'm speaking in principle here, I'm not proposing a specific feature set, but 
both "q" as well as Perl 6's related quoting mechanism is useful for guidance, 
and I think something involving delimiters is best.


But if some of that sounds unduly complicated, I have a better idea.

I propose that the DBI include an API for users to tell the driver what possible 
escape delimiters they are using.  For example, doable at least at a statement 
level (and optionally on a connection/etc level for defaulting if that makes sense).


The API could involve an 'attr' given when preparing a SQL statement or other 
appropriate places.


  placeholder_escape_delimiters => [ '\{','}' ]
  placeholder_escape_delimiters => [ '\{','}','\[',']' ]
  placeholder_escape_delimiters => [ '\{{{','}}}' ]
  placeholder_escape_delimiters => [ '{{{','}}}' ]

In this way, the backslash is no longer special, or necessary, though I 
anticipate it would often still be used for the mnemonics.


Rather, when the driver is parsing the SQL for placeholders, if it encounters 
any left delimiter strings, if will leave the following SQL unaltered until it 
encounters the corresponding right delimiter string, and then it looks for 
placeholders again.


(As to numbered placeholders, which are effectively a special case of named 
placeholders, not being directly composable in SQL::Abstract, I see that as 
being a problem itself.  It would be a great help to developers in principle if 
the native way for working with parameters was named rather than positionally. 
However, that is really a separate matter to deal with and I think it is a good 
idea for Tim's proposal in some form to happen regardless of dealing with this 
separate matter.)


-- Darren Duncan



Escaping placeholders, take 2

2014-12-28 Thread Tim Bunce
Wherein I attempt to summarize the requirements, the background, the options,
the risks, and the counter-proposal ...

** Background

Drivers that support placeholders parse the SQL statement to find them.
The parsing is really just simple tokenization that only knows how to skip
over quoted strings, quoted identifiers, and comments for the SQL dialect
used by the database. The remaining text is checked for placeholders.

A question mark is the international SQL standard for placeholders.
Per ISO/IEC 9075-2:2003 section 4.24 Dynamic SQL concepts:

In SQL-statements that are executed dynamically, the parameters are
called dynamic parameters (s) and
are represented in SQL language by a  (?).

Some drivers also support placeholders expressed as a colon ':' followed
by a number and/or a name. Some drivers support other forms like a dollar
'$' followed by a number. Different types of placeholders cannot be
mixed within a statement.

The number of parameters passed to execute (etc.) has to match the number
of placeholders found in the statement or an error is reported.

An individual SQL statement typically comes primarily from one of these sources:
1. Written literally in the source code, perhaps with some $interpolation
2. Constructed dynamically, in whole or part, using a module like SQL::Abstract
3. From an external source, like an SQL script file or 'catalog'.


** Historical Issues and Work-arounds

Sometimes the driver will identify placeholders in the SQL statement
that weren't intended by the developer.  So some drivers have added
ad-hoc ways to limit the parsing of placeholders.  For example:

  pg_direct - query passed directly without parsing for placeholders at all
  pg_placeholder_dollaronly - question marks are not treated as placeholders
  pg_placeholder_nocolons - colons are not treated as placeholders
  ora_placeholders - disable processing of all placeholders
  odbc_ignore_named_placeholders - disable processing of :foo placeholders

In most cases these have been added as a workaround for statements that
contain colons followed by words that the driver was treating as placeholders.
(For example :old and :new when defining Oracle triggers.)

The exception is pg_placeholder_dollaronly which was added in 2008 to
allow use of postgres geometric operators like "?||" ("are parallel")
See http://www.postgresql.org/docs/9.4/static/functions-geometry.html

So supporting multiple styles of placeholders has both caused problems
and allowed drivers to "dodge the bullet" when one style has become a
problem in a particular situation. But that's only a limited solution.
It doesn't promote the development and use of generic higher-level reusable
modules. (You could frame the progress of software development in terms
of enabling developers to work at ever-higher levels of abstraction.)


** Limitation of Existing Workarounds

A key problem with these attribute workarounds is that they affect the
entire statement.  That's a good approach when loading statements from
an external source, and often ok when writing individual DBI calls directly
where an alternative placeholder style or quote()'ing can be used.

In modern application development, however, the DBI is just another
foundation layer, low down in the stack. Above it are modules like
SQL::Abstract, and ORMs like DBIx::Class. Companies have large and
growing investments in these stacks, plus the modules they've built over
them to provide abstraction and encapsulation of business logic.

For these large applications the attribute workarounds are rarely helpful.

Working 'high up' in an application stack that's dynamically constructing
SQL with placeholders it should be possible to express any desired behavior.


** Requirements

Some mechanism is needed to allow an individual character sequence that
normally represents a placeholder, to appear in the SQL without it being
treated as a placeholder by the driver. To "escape" it's usual meaning.

If escaping question mark placeholders was already defined in 2008 then
pg_placeholder_dollaronly might not have been added to DBD::Pg.

The issue of escaping placeholders is on the table now because
PostgreSQL supports a JSON type with a rich set of features and
operators, some of which use a question mark.  For example, the
expression json_type_field ? 'bar' is true if the string 'bar' exists as
a key/element exist within the JSON value of json_type_field. For more
information see http://www.postgresql.org/docs/9.4/static/functions-json.html
Use of JSON and these operators is likely to be much more common than
use of the geometric operators mentioned earlier.

So, while the needs of Postgres users is what prompted this topic now,
the requirement is more general. It would be good to agree on an approach
that could be applied to most drivers if the need arises.


** Options

There seem to be several options worth considering

1. Backslash

A backslash could be placed immediately before a question mark t