RE: [sqlite] ticket 1147

Robert Simpson Mon, 28 Feb 2005 07:49:10 -0800

> -----Original Message-----
> From: D. Richard Hipp [mailto:[EMAIL PROTECTED] 
> Sent: Monday, February 28, 2005 5:38 AM
> To: [email protected]
> Subject: Re: [sqlite] ticket 1147
> 
> On Mon, 2005-02-28 at 11:12 +0200, Cariotoglou Mike wrote:
> > I understand that this "column names" issue is becoming a 
> pain for the
> > sqlite authors, but OTOH, it is very important for wrapper 
> authors...
> > 
> 
> Why?  Why does anybody care what the column names in the result
> are?  What are the column names used for other than to print a
> header at the top of a table for human-readable output?
> 
> Remember that not all wrappers are done by third parties.
> I ship the TCL wrapper with SQLite and for some reason I
> have never felt the need to turn on long_names or short_names.
> The default column names, whatever they might be, have
> generally been acceptable.
> 
> So what is the difference?  What are all these other
> wrappers doing that requires specific column names formats?


Here's the deal ...

When it comes to enhancing or building infrastructure over the top of
existing open source, consistency is the father of invention.  The problem
with the various column_names pragmas is that their behaviors and intents
have either changed or been broken over the last few updates to the core
engine.  This kind of inconsistency renders the pragmas useless.
Additionally, when you change the naming behaviors of the column names, any
code that depends on their consistent behavior then fails of course.

The crux of the problem for us Schema Dogs is this ...

1.  SQLite parses SQL statements.  It does a really good job.  Logically
SQLite will never execute a SQL statement it doesn't understand or cannot
parse.

2.  SQLite at the time it parses a statement, knows exactly what columns it
is going to return and from where those columns came.

3.  Schema information can be gleaned from a resultset *if the column names
are consistent*.  There are three ways to do this:
    A.  Parse the SQL statement yourself to figure out what columns from
what tables are being returned and match them in order to SQLite's output.
Redundant, error-prone, and not even remotely as good of parsing as what
SQLite can do.  Chances are that your parsed SQL may not even be close to
what actually gets returned from SQLite.
    B.  Let SQLite do the work, but once SQLite has figured out what columns
its returning, we need some way of knowing what table id and column id each
column in the resultset belongs to (if any).
    C.  Parse and try and figure out the EXPLAIN results.  The results from
EXPLAIN will give you everything you need, but it requires you emulating all
128 opcodes and trying to make heads or tails of it all.  NASTY.  Especially
since SQLite already did all the work!

4.  Schema information is important.  Maybe not to some, but it is
nonetheless important.  Lacking any other method of associating a column in
a resultset to a column in a table or view, we are forced into using the
column name to build logic around making that association.  When the column
name is inconsistent and/or pragmas which enhance the column name are
broken, it is an impossible task.

5.  What we do with the schema information or how well we compute it is
irrelevant.  One cannot compute the value of Y in the statement 1 + X = Y
without additional information.  We're not asking for a reason for X or Y's
existence or what we intend to do with Y once we figure it out, we're just
asking for enough information to be able to complete the statement.  Having
column names in a consistent format is tantamount to giving us our X.  

6.  Even if full_column_names was fixed tomorrow, it still requires that we
parse the SQL statement ourselves to try and figure out what columns from
what tables are being returned so we can match it all up.  Which is why I
proposed real_column_names to begin with.  We will never be able to parse
SQL and verify it like SQLite does.

Heck, ANY of the following solutions will give us our missing X! (given in
order of complexity to implement)

1.  PRAGMA real_column_names and about 8 lousy lines of code in
generateColumnNames
2.  A callback function during generateColumnNames that gives us an
opportunity to either set or change the column name, and gives us the column
type (TK_COLUMN, etc) with the database id, table id and column id to which
the column belongs.
3.  Some kind of sqlite3_select_schema() function that takes the SQL text,
parses it and returns the column name, database, table/view name and alias
name for each column in the select.  Any column not bound directly to a
table or view would have null for column name and table/view name, leaving
only the alias.

Robert

RE: [sqlite] ticket 1147

Reply via email to