On 05/01/2013 10:41, Darren Duncan wrote:
My understanding about Postgres either compressing strings or using
"toast" segments for longer ones is that this is just an internal
implementation detail and that user-facing concepts like data types
should be ignoring these differences. Character data is just "text",
a sequence of characters of arbitrary length, and that's all there is
to it.
It does indeed:
http://www.postgresql.org/docs/9.2/static/storage-toast.html
So PostgreSQL's char, varchar, and text types can all store up to
around 1GB.
I'm currently trying to find some sort of equivalence between the RDBMS
types. For PostgreSQL, the distinction between what they might call char
and clob is hidden from the user, but it does kind of exist internally.
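To make the idea concrete, here's a rough sketch (in Python, purely
illustrative; the groupings and spellings below are my own assumptions for
discussion, not actual DBI/DBD type_info data) of the kind of equivalence
table I'm talking about:

```python
# Illustrative sketch only: a cross-RDBMS character-type equivalence table.
# The type names and groupings are assumptions for discussion, not real
# type_info output from any driver.
CHAR_TYPE_EQUIVALENCE = {
    # standard name -> per-RDBMS spelling
    "CHARACTER": {"PostgreSQL": "char(n)", "MySQL": "CHAR(n)"},
    "CHARACTER VARYING": {"PostgreSQL": "varchar(n)", "MySQL": "VARCHAR(n)"},
    "CHARACTER LARGE OBJECT": {"PostgreSQL": "text", "MySQL": "LONGTEXT"},
}

def equivalent_type(standard_name, rdbms):
    """Look up the RDBMS-specific spelling of a standard character type."""
    return CHAR_TYPE_EQUIVALENCE[standard_name][rdbms]
```

The catch, as discussed below, is that for PostgreSQL the right-hand column
is mostly the same type under the hood.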
I've got a copy of the standard, but I'm pretty sure I'd be breaking
some law if I copied and pasted bits. DECIMAL is supposed to be an
"exact numeric", whereas FLOAT, REAL, and DOUBLE PRECISION are
"approximate numeric". So I guess DOUBLE is a better fit as it's
supposed to be more accurate, but neither is a good match really.
Here's the thing. The most important difference is "exact numeric"
versus "approximate numeric". Your type list should clearly and
explicitly separate these into separate rows from each other, and
never cross types between them.
Yes, there is a good argument for this. I guess it would help to know
exactly what the DBD type_info is used for. I'm not sure if it has some
internal use, or whether it's just something to give to users who are
asking for particular types.
Things like integers or rationals or DECIMAL etc are exact numeric.
Things like FLOAT and DOUBLE are analogous to IEEE floats which are
inexact and are defined by using a certain number of bits to store
every value of their type.
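A quick way to see the exact/approximate split in practice — shown here in
Python, since its float is the same IEEE 754 double underneath, and its
decimal module behaves like an exact numeric:

```python
from decimal import Decimal

# Approximate numeric: IEEE 754 binary floats cannot represent 0.1
# exactly, so the familiar identity fails.
assert 0.1 + 0.2 != 0.3  # actually 0.30000000000000004

# Exact numeric: decimal values constructed from strings keep their
# exact decimal representation, as DECIMAL is required to.
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```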
I don't recall which of the above REAL goes in.
REAL is approximate, inexact.
If different DBMSs use the same FOO to be exact in one or approximate
in another, still keep them in different rows.
Since the SQL standard, as well as some other programming languages,
defines "decimal" as being exact numeric, it is absolutely wrong
to map it to either FLOAT or DOUBLE. In fact, in Perl 5 the only
native type that can map to a DECIMAL is a character string of numeric
digits. Don't shoehorn this. There is no good reason to map DECIMAL
to a floating-point type.
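To illustrate why the string mapping matters (a Python analogy, since
Perl's numeric scalar is the same IEEE double): round-tripping a DECIMAL
value through a float silently changes it, while carrying it as a string
of digits keeps it exact. The value below is a made-up example, not from
any real driver:

```python
from decimal import Decimal

value_from_db = "0.10"  # hypothetical DECIMAL(10,2) value fetched as text

# Mapping it through a float loses the exact decimal value...
approx = float(value_from_db)
assert Decimal(approx) != Decimal(value_from_db)

# ...while keeping it as a string of digits preserves it exactly
# (including the scale), and lets an exact-numeric library use it.
exact = Decimal(value_from_db)
assert str(exact) == "0.10"
```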
I'm not overly familiar with Perl's internal handling of numbers. I guess
if you have a DECIMAL held as a character string, Perl will switch it out
to an approximate value the moment you do a calculation on it. Furthermore,
if the DBI (or the DBDs, I'm not sure where the distinction lies) is
already putting it into a Perl floating-point number, then the battle
has already been lost before it gets to me.
Likewise, under no circumstance, map integers to floating types.
It also has SQL_VARCHAR associated with TEXT instead of VARCHAR.
Not sure about this one either - if there was a reason for that,
I don't remember it offhand.
Thanks for doing this work, it's good to see someone poking at
this. +1 for getting the DBDs somewhat consistent, although it
may be tricky as some (looking at you, MySQL!) play more fast and
loose with their data type definitions than others. :)
Yes, the hard part is certainly trying to find consistency, or a way of
making them emulate some consistency.
Since in Postgres a VARCHAR is just a restricted TEXT, mapping
length-restricted character types to VARCHAR and unrestricted ones
like Perl strings to TEXT would carry the best semantics.
PostgreSQL does recommend using VARCHAR instead of CHAR; due to the way
they are implemented, VARCHAR is usually more efficient.
Lyle