Re: Missing double quotes around table and column names

Alok Menghrajani Tue, 09 Dec 2014 10:07:16 -0800

On Tue, Dec 9, 2014 at 12:45 AM, Lukas Eder <[email protected]> wrote:
>
>
> 2014-12-09 0:37 GMT+01:00 Alok Menghrajani <[email protected]>:
>>
>> On Mon, Dec 1, 2014 at 11:23 PM, Lukas Eder <[email protected]> wrote:
>> > There is certainly room for improvement in that area. The two modes of
>> > operation are very similar in the way they're used throughout the API as
>> > both operations simply require String arguments, and String doesn't have
>> > have a semantics per se.
>>
>> Correct. You could require the raw SQL statements to be wrapped in
>> DSL.raw():
>>
>> DSL.select().from(DSL.raw("plain SQL table"))
>
>
> "raw()" by itself would be insufficient, because there are currently four
> different types of objects that can be constructed this way: Schema, Table,
> Field, and Sequence. It would be inevitable to call those methods
> rawSchema(), rawTable(), rawField(), rawSequence() ... or just ... schema(),
> table(), field(), sequence() :-)


DSL.raw() does not need to actually return an escaped string. It could
return an object, something like:

class RawString {
  private String raw;

  public RawString(String raw) {
    this.raw = raw;
  }

  public String escapeForSchema() {
    ...
  }
  public String escapeForTable() {
    ...
  }
  etc.
}

RawString's escape methods would then get called by the API as needed.
DSL.raw() could be seen as a promise to escape your string, once the
context is known.

> but see more on that below...
>
>> vs
>>
>> DSL.drop("tableByName")
>>
>> The advantage of doing this is that:
>> * Forgetting to do something (i.e. forgetting to call DSL.raw) does
>> not lead to potentially unsafe code.
>
>
> That's a good point, although, you're inserting strings into the API. This
> has an implicit feel of "ooh, am I doing this right?" I agree, though that
> terms like "raw" would make users even more alert.

I'm glad we agree on the same things :)

> I personally think that explicitness is most useful with the "*ByName"
> semantics where Strings must only contain schema/table/field/sequence
> identifiers, which is a big feature reduction compared to raw/plain SQL from
> a cognitive perspective. Many people currently get this wrong.
>
> With plain SQL, you will always know what's going on when reading the code,
> though, I suspect? E.g. these two:
>
> DSL.using(configuration)
>    .select(count(VBL_DEFAULT.ID))
>    .from(table("vbl_default partition(p_04-Dec-14)"))
>    .where(...);
>
> and
>
> DSL.using(configuration)
>    .select(count(VBL_DEFAULT.ID))
>    .from("vbl_default partition(p_04-Dec-14)")
>    .where(...);
>
> Specifically, the version where a String is passed to "from" seems
> particularly readable to me.

To me, it's unclear in which case it's ok to use externally controlled
strings and in which case it's not ok. I would love to get the opinion
of all the other people on this list.

>> * If DSL.raw is used consistently, it makes it easy to audit / setup a
>> commit hook / highlight things at code review time / etc.
>
>
> This could also be achieved with a new annotation, which could be checked
> via JSR-308. I've had a very interesting discussion with spec lead Mike
> Ernst recently, about the potential of JSR-308 validation of the current
> @Support annotations to check that only API supported by the underlying
> database is used. He was absolutely thrilled by the idea of doing this in
> jOOQ, he almost forgot his dinner ;-)
>
> In any case, I've registered an issue for adding @PlainSQL annotations:
> https://github.com/jOOQ/jOOQ/issues/3851
>
> No matter how we evolve this, it would be useful to annotated potentially
> dangerous methods this way. We have already added the same SQL injection
> warning in all Javadocs, but formal client-side validation might be even
> better. We'll be doing some research and blog about JSR-308 pretty soon...

http://bunkstrutts.files.wordpress.com/2011/11/corgi-bounci.gif

>> > One way to disambiguate this would be to deprecate the tableByName and
>> > fieldByName methods (and all methods that accept strings with a "Name"
>> > semantics), and offer alternative API methods accepting the
>> > org.jooq.Name
>> > type:
>> >
>> > DSL.table(DSL.name("tableByName"));
>> > DSL.field(DSL.name("fieldByName"), Integer.class);
>>
>> Would you keep the DSL.table(String), DSL.field(String) methods?
>
>
> Yes, probably.
>
>>
>> If yes, I think it's safer to do it the other way around for the reasons
>> I mentioned above.
>
>
> Maybe, but it would be a heavy change and chances are that once you do need
> some String version of an object (raw or just the name), that the name will
> not be sufficient for very long and you will have to switch again to raw
> SQL.

Sorry, I lost you. I think it's fine to have raw SQL in the API, but
its use should be the exception rather than the rule.

> If we fusion your ideas with mine, the safest and most thorough thing would
> be to create AST elements for everything and to gently remove the current
> "dangerous" convenience methods (although I will think about this
> thoroughly, as it is really a big change).

You know jooq better than anyone else to know if this is the right
approach. It feels right to me, but you'll need to find a way to do
this gradually / without breaking too much existing code.

> Pseudo-API:
>
> class DSL {
>
>     // create names for use with any other AST element
>
>     static Name name(String... qualifiedName) { ... }
>
>     // create plain SQL for use with any other AST element
>
>     static SQL sql(String sql) { ... }
>
>     static SQL sql(String sql, Object... bindings) { ... }
>
>     static SQL sql(String sql, QueryPart... parts) { ... }
>
>     // Use Name or SQL in typed AST elements
>
>     static Field<Object> field(Name name) { ... }
>
>     static Field<Object> field(SQL sql) { ... }
>
>     static Table<?> table(Name name) { ... }
>
>     static Table<?> table(SQL sql) { ... }
>
>     static DropTableStep dropTable(Table<?> table) { ... }
>
>     static DropTableStep dropTable(Name name) { ... }
>
> }
>
> interface SelectWhereStep {
>
>     SelectConditionStep where(SQL sql) { ... }
>
> }
>
>
> interface Name extends QueryPart { ... }
>
> interface SQL extends QueryPart { ... }
>
>
> Marked in yellow: The only API elements that really take plain SQL arguments
> - just like your "DSL.raw()".
>
> The advantage of this new SQL type would be the fact that it can immediately
> be embedded in other plain SQL templates, as in
>
> QueryPart anyExpression = DSL.sql("abc + xyz");
>
> DSL.field(DSL.sql("some_function({0})", anyExpression));
>
>
> The same way that Name can already be embedded, or even keywords via
> DSL.keyword()
>
>> > This would make using the jOOQ API a bit more verbose, but it might
>> > still be
>> > acceptable.
>>
>> The API becomes a bit more verbose (an extra method call around a
>> bunch of (hopefully) static strings). However don't forget the API
>> becomes simpler (since you can deprecate/remove tableByName,
>> fieldByName, etc.). Overall it's a reduction in the cognitive work for
>> the library users.
>
>
> Exactly. This confusion popped up too many times, which means that the
> current status quo is more than insufficient from a "cognitive work"
> perspective.
>
> Even if maybe plain SQL should be restricted to only very few methods, all
> Fields should be obtained via overloaded DSL.field(...) methods, Tables
> should be obtained via overloaded DSL.table(...) methods.

Alok

-- 
You received this message because you are subscribed to the Google Groups "jOOQ 
User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: Missing double quotes around table and column names

Reply via email to