Re: column_info () vs type_info () (Summary)

Martin Evans Wed, 10 Mar 2010 02:26:55 -0800

Martin Evans wrote:
> H.Merijn Brand wrote:
>> On Mon, 08 Mar 2010 10:13:02 +0000, Martin Evans
>> <martin.ev...@easysoft.com> wrote:
>>
>> large original chunks snipped ...
>>
>>> H.Merijn Brand wrote:
>>>> I see a big difference in what $sth->{TYPE} returns (and the name) and
>>>> what column_info () - if implemented - is returning.
>>> I don't think I do with DBD::ODBC (results below).
>>>
>>>> DATA_TYPE has no specification of what type of code that is. It can be
>>>> either the code the type is internally known by with the database, or
>>>> it can be the ODBC equivalent.
>>>>
>>>> TYPE_NAME has no guarantee whatsoever to be like what type_info ()
>>>> returns with code like:
>>> I thought it should.
>>>
>>>> --8<---
>>>> {   my %types;     # Cache for types
>>>>
>>>>     # Convert numeric to readable
>>>>     sub _type_name
>>>>     {
>>>>    my $type = shift;
>>>>
>>>>    unless (exists $types{$dbh}{$type}) {
>>>>        my $tpi = $type =~ m/^-?[0-9]+$/ ? $dbh->type_info ($type) : undef;
>>>>        $types{$dbh}{$type} = $tpi ? $tpi->{TYPE_NAME} : $type // "?";
>>>>        }
>>>>    return $types{$dbh}{$type};
>>>>    } # type_name
>>>>     }
>>>> -->8---
>>>>
>>>> The keys in the hashref returned from column_info () often do not honor
>>>> the {FetchHashKeyName} dbh attribute, which makes it quite a bit harder
>>>> to write database-independent code. I think either document that the
>>>> sth returned from column_info () doesn't have to follow this attribute,
>>>> or make the authors alter the code so it does.
>>> I guess you are mostly referring to the 'COLUMN_NAME', 'TABLE_NAME',
>>> 'TABLE_SCHEM' and 'TABLE_CAT' keys - yes?
>> Yes, but esp the *extra* fields returned. FetchHaskKeyNames refers to
>> the date returned in the hashref. The 4 you name are normally provided
>> to column_info () and not the ones you want to examine. What I mean is
>> *all* the keys, so also DATA_TYPE, TYPE_NAME, COLUMN_SIZE,
>> BUFFER_LENGTH, DECIMAL_DIGITS, NUM_PREC_RADIX, NULLABLE, REMARKS,
>> COLUMN_DEF, SQL_DATA_TYPE, SQL_DATETIME_SUB, CHAR_OCTET_LENGTH,
>> ORDINAL_POSITION, IS_NULLABLE. etc
>>
>> For example, in MySQL, a hash like this is returned:
>>
>> {   BUFFER_LENGTH    => undef,
>>     CHAR_OCTET_LENGTH => undef,
>>     CHAR_SET_CAT     => undef,
>>     CHAR_SET_NAME    => undef,
>>     CHAR_SET_SCHEM   => undef,
>>     COLLATION_CAT    => undef,
>>     COLLATION_NAME   => undef,
>>     COLLATION_SCHEM  => undef,
>>     COLUMN_DEF       => undef,
>>     COLUMN_NAME      => 'xbb',
>>     COLUMN_SIZE      => 20,
>>     DATA_TYPE        => 4,
>>     DECIMAL_DIGITS   => undef,
>>     DOMAIN_CAT       => undef,
>>     DOMAIN_NAME      => undef,
>>     DOMAIN_SCHEM     => undef,
>>     DTD_IDENTIFIER   => undef,
>>     IS_NULLABLE      => 'NO',
>>     IS_SELF_REF      => undef,
>>     MAX_CARDINALITY  => undef,
>>     NULLABLE         => 0,
>>     NUM_PREC_RADIX   => 10,
>>     ORDINAL_POSITION => 1,
>>     REMARKS          => undef,
>>     SCOPE_CAT        => undef,
>>     SCOPE_NAME       => undef,
>>     SCOPE_SCHEM      => undef,
>>     SQL_DATA_TYPE    => 4,
>>     SQL_DATETIME_SUB => undef,
>>     TABLE_CAT        => undef,
>>     TABLE_NAME       => 'xbb',
>>     TABLE_SCHEM      => undef,
>>     TYPE_NAME        => 'BIGINT',
>>     UDT_CAT          => undef,
>>     UDT_NAME         => undef,
>>     UDT_SCHEM        => undef,
>>     mysql_is_auto_increment => 1,
>>     mysql_is_pri_key => 1,
>>     mysql_type_name  => 'bigint(20) unsigned',
>>     mysql_values     => undef
>>     }
>>
>> and FetchHashKeyName was set to "NAME_lc", which IMHO should have
>> returned ALL keys lowercase.
> 
> I see what you mean now.
> 
> I don't see why FetchHashKeyName should affect those keys - I believed
> it was for the cases where keys are column names. Those keys are part of
> the DBI spec not some variable thing depending on database e.g., whether
> case is maintained on a column name or not.
> 
> 
>>>> Extra fun comes from databases that store type names instead of type
>>>> codes in their data-dictionary (like Unify and SQLite), and reversing
>>>> that process to make column_info () return both TYPE_NAME and DATA_TYPE
>>>> makes it a different pair than TYPE and the derived counterpart from
>>>> type_info ().
>>>>
>>>> My real question is, should the docs be enhanced to
>>>>
>>>> • make clear that these two return different things
>>> or make them return the same things. Obviously for ODBC this is simple
>>> as they are the same things but for other DBDs I think it is useful to
>>> know a single type that can be used across all databases and the real
>>> type implemented in the database (and be able to map between them) -
>>> from your results mysql looks closest in this respect.
>> indeed, and I was planning to make the two columns identical for Unify
>> too. But the docs did not force me to do so, and making translation
>> tables is causing hardcoded code, which is harder to maintain.
> 
> I can see this is not ideal for you.
> 
>>> People writing bugzilla, open LDAP etc backend support in databases are
>>> having to hand code the schema for each database but in many cases it
>>> may be possible (if the DBDs returned a single set of types) to code
>>> this generically (although I'd guess it would be still quite hard).
>>>
>>>> • column_info () is not always available (sth is undef then)
>>> I guess so.
>>>
>>> It should not be difficult to add column_info to DBD::Oracle - I know
>>> this has come up in the past. I think I even provided some SQL that
>>> would do it but I cannot find it right now.
>> I did the initial implementation for Unify in about 30 minutes, but
>> then I ended up finding these strange behaviours and digged into all
>> the other databases.
>>
>>>> Here's my findings so far ...
>>>>
>>>> PostgreSQL
>>>>   Create as               sth attributes              column_info ()
>>>>   ----------------------- --------------------------  
>>>> -----------------------------------
>>>>   bigint                  ?                       -5  bigint               
>>>>             -5
>>>>   bigserial               ?                       -5  bigint               
>>>>             -5
>>>> :
>>>>
>>>> DBD::Oracle does not support column_info ()
>>>>
>>>> Oracle
>>>>   Create as               sth attributes              column_info ()
>>>>   ----------------------- --------------------------  
>>>> -------------------------
>>>>   bfile                   ?                    -9114  -
>>>>   blob                    ?                       30  -
>>>> :
>>>>
>>>> DBD::Unify will have column_info () in the next release. I'm all open
>>>> for changes.
>>>>
>>>> Unify
>>>>   Create as               sth attributes              column_info ()
>>>>   ----------------------- --------------------------  
>>>> -----------------------------------
>>>>   amount                  FLOAT                    6  AMOUNT               
>>>>           -206
>>>>   amount (5, 2)           FLOAT                    6  AMOUNT               
>>>>           -206
>>>> :
>>>>
>>>> MySQL
>>>>   Create as               sth attributes              column_info ()
>>>>   ----------------------- --------------------------  
>>>> -----------------------------------
>>>>   serial                  bigint                  -5  BIGINT               
>>>>              4
>>>>   tinyint                 tinyint                 -6  TINYINT              
>>>>              4
>>>> :
>>>>
>>>> SQLite does not return handle TYPE attribute attributes :(
>>>>
>>>> SQLite
>>>>   Create as               sth attributes              column_info ()
>>>>   ----------------------- --------------------------  
>>>> -----------------------------------
>>>>   int                     ?                        0  int                  
>>>>              ?
>>>>   integer                 ?                        0  integer              
>>>>              ?
>>>> :
>>> and here are mine with MS SQL Server (and our driver) - note there may
>>> be a small issue with the XML type here - I have passed this on to the
>>> person responsible. I've skipped multiple identity columns but could
>>> provide more if you really want it.
>>>
> 
> <snipped my sql server results>
> 
>>> I seem to remember a similar issue with the lack of column_info came up
>>> on Perl monks recently with DBD::Interbase. With the increasing use of
>>> modules like DBIx::Class etc the meta data methods are becoming a lot
>>> more important than they perhaps used to be.
> 
> Martin


H.Merijn Brand and I have been talking this over on #dbi and Tim asked
for a summary. I don't want to misrepresent what Merijn said, afterall
he started the thread but I think it boils down to these issues:

1. the TYPE attribute on a statement is clearly documented as to what it
should contain - "The values correspond to the international standards
(ANSI X3.135 and ISO/IEC 9075) which, in general terms, means ODBC".

However, it is not clear from the docs what TYPE_NAME and DATA_TYPE
columns should be in the column_info method and how they compare with
the same named columns returned by the type_info method.

e.g., for the column_info method:

DATA_TYPE: The concise data type code
  this could be the database internal type number or the ODBC type
  the current opinion is that it should be the ODBC type and extra
  keys added for the internal type (e.g., ora_type, uni_type depending
  on DBD prefix) NOTE mysql already adds "mysql_is_auto_increment",
  "mysql_is_pri_key", "mysql_type_name", "mysql_values" keys.
TYPE_NAME: A data source dependent data type name.

for type_info:

DATA_TYPE (integer) SQL data type number.
TYPE_NAME (string) Data type name for use in CREATE TABLE statements

Proposed Solution: document column_info DATA_TYPE as being the same as
{TYPE}  but allow DBDs to add other keys to the column_info result (some
already do).

2. There is no guarantee that if you find a TYPE_NAME in column_info you
can map it successfully to the type in type_info - this is annoying and
difficult to workaround.

Proposed Solution: document they should be the same

3. column_info is not always provided by a DBD and the documentation
fails to mention that in this case the returned statement handle is undef.

Proposed Solution: update documentation

4. It appears FetchHashKeyName is not honoured in the results of
column_info though how this occurs does appear to depend on the DBD.
Merijn got all upper case keys from MySQL when FetchHashKeyName =
name_lc e.g.,

{   BUFFER_LENGTH    => undef, # <-- should have been lowercase
    CHAR_OCTET_LENGTH => undef,# <-- should have been lowercase
    .
    .
}

and with DBD::ODBC I find the column names follow FetchHashKeyName but
not the key columns passed to fetchXXX_arrayref e.g.,

with NAME_lc I get:

$VAR1 = {
          'PSDATE' => { # <-- these are as the database returns them
                        # and not lowercased here
                        'char_octet_length' => undef, # correctly cased

and name_uc I get:

$VAR1 = {
          'PSDATE' => { # correctly cased by accident - as db returned
                        'DECIMAL_DIGITS' => '3', # correctly cased

This does not altogether surprise me since in DBD::ODBC's case the
result-set is generated by the ODBC driver not by DBD::ODBC.

Proposed Solution: The MySQL case is plain wrong (needs fixing), the
DBD::ODBC is only wrong if case of the data_type_name is supposed to
follow FetchHashKeyName as well. What should it be?

Martin
-- 
Martin J. Evans
Easysoft Limited
http://www.easysoft.com

Re: column_info () vs type_info () (Summary)

Reply via email to