On 01/01/2013 03:12, Greg Sabino Mullane wrote:
> Lyle wrote:
>> I'm sure there are more things that look a bit odd or out of place. I'm
>> not sure whether there is good reasoning behind them, or whether the DBD
>> developers have been doing a best guess and we might possibly want to
>> consider making things more consistent?
> Certainly a best guess for most of the ones in Postgres aka DBD::Pg.
> Patches, corrections, suggestions, complaints certainly welcome.
> It's a fairly unloved bit of the code.

I see. I've put the comparison tool on github:
https://github.com/cosmicnet/CompareRDBMS
It's pretty lightweight: it uses CGIApp and HTTP::Server::Simple to create a standalone app you access through your browser. The default port is 8080, but you can change it when you start the server on the command line. At the moment you can only configure one database per driver for the comparison, and the GUI needs some more styling, but it's functional.
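For anyone who hasn't used HTTP::Server::Simple before, the pattern the app builds on looks roughly like this. This is a minimal sketch, not the actual CompareRDBMS code; the class name and page content here are invented for illustration:

```perl
#!/usr/bin/perl
# Minimal sketch of the HTTP::Server::Simple::CGI pattern.
# DemoServer and the page body are placeholders, not CompareRDBMS code.
use strict;
use warnings;

package DemoServer;
use base 'HTTP::Server::Simple::CGI';

# Called once per request with a CGI object.
sub handle_request {
    my ($self, $cgi) = @_;
    print "HTTP/1.0 200 OK\r\n";
    print $cgi->header('text/html'), "<h1>It works</h1>";
}

package main;

# Port defaults to 8080 but can be overridden on the command line,
# as with the comparison app.
my $port = shift @ARGV || 8080;
DemoServer->new($port)->run;
```

Point a browser at http://localhost:8080/ once it's running.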

>> PostgreSQL BYTEA that's currently SQL_VARBINARY would seem a better
>> SQL_LONGVARBINARY...
> Yeah, one has to think more in terms of the opposite mapping - if someone
> asks for SQL_VARBINARY, what do we give them? Technically, BYTEA maps
> to both, as Postgres has no (explicit) length limitation. When in doubt,
> we use the more "standard" and/or "common" one. In this case, no LONG.

In that case, maybe the MySQL driver's idea of having multiple mappings for the same type is the right way. If someone is asking for SQL_LONGVARBINARY then BYTEA is certainly capable.
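This mapping is what DBI exposes through type_info() and type_info_all(); a driver that advertises several rows for one SQL_* code is doing exactly what DBD::mysql does. A rough sketch of the lookup (the DSN is a placeholder, and you'd need a reachable database for this to run):

```perl
#!/usr/bin/perl
# Sketch: ask a driver which native type it advertises for a DBI/ODBC
# type code. The DSN below is a placeholder; adjust for your own setup.
use strict;
use warnings;
use DBI qw(:sql_types);

my $dbh = DBI->connect('dbi:Pg:dbname=test', '', '', { RaiseError => 1 });

# type_info() returns the driver's best match for the requested code;
# type_info_all() returns every row, so a driver can legitimately list
# BYTEA under both SQL_VARBINARY and SQL_LONGVARBINARY.
my $info = $dbh->type_info(SQL_LONGVARBINARY);
print defined $info
    ? "SQL_LONGVARBINARY -> $info->{TYPE_NAME}\n"
    : "no SQL_LONGVARBINARY mapping advertised\n";

$dbh->disconnect;
```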

>> Similar situation for PostgreSQL character.
> Yep.
>> Reviewing the PostgreSQL documentation on CHARACTER it mentions things
>> like short, long and very long character strings, but lacks detail so
>> I've emailed them about it.
> Those are internal details about how they are stored and won't affect
> anything as far as an application is concerned.

I thought it might be useful to know that strings below a certain length are stored uncompressed and so are a little faster. Likewise, very long strings use a different storage mechanism that one might want to avoid. Although I've only just had a reply to my post asking for specifics, and haven't had a chance to look into it further.

>> MySQL's FLOAT and DOUBLE are linked to several ODBC types, perhaps
>> PostgreSQL could do the same? Or is that bad practice on the
>> MySQL driver's part?
> Hard to say, can you be more specific? Keeping in mind that I don't
> actually know a whole lot about the SQL/ODBC specs, and the differences
> therein. :)

From what I understand, the ISO working group was formed in an attempt to standardise the variety of SQL dialects that were appearing. The various players, such as Oracle, IBM, etc., had people in the working group trying to come to an agreement and form the specs. Of course, they each had their own objectives. For many of the features, they already had their own syntax. Various committee votes decided the official syntax, which could be from one source, a mixture, or something new. Then the implementations might adopt the standard way of doing it, or decide that actually their way was better. If it was a feature they had already implemented, they would have legacy syntax to deprecate (or keep) if they adopted the new standard syntax.

The standard itself doesn't define number ranges for things like INT; those are defined by implementations. So SMALLINT, INT and BIGINT can happily all be the same. SMALLINT and BIGINT were brought in, like other type variations, because implementations wanted to add new integers with different ranges. So the standard agreed on new names, but in the standard's eyes they are practically synonyms.
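As a concrete example of those implementation-defined ranges, PostgreSQL documents SMALLINT/INTEGER/BIGINT as 2-, 4- and 8-byte types, a choice the standard leaves entirely to the implementation. A quick Perl illustration:

```perl
#!/usr/bin/perl
# PostgreSQL's documented integer ranges. The SQL standard only requires
# that SMALLINT's precision <= INTEGER's <= BIGINT's; an implementation
# could legally make all three identical.
use strict;
use warnings;

my %range = (
    SMALLINT => [ '-32768',               '32767' ],                # 2 bytes
    INTEGER  => [ '-2147483648',          '2147483647' ],           # 4 bytes
    BIGINT   => [ '-9223372036854775808', '9223372036854775807' ],  # 8 bytes
);
for my $type (qw(SMALLINT INTEGER BIGINT)) {
    printf "%-8s %s .. %s\n", $type, @{ $range{$type} };
}
```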

Obviously I'm simplifying things a lot, but you get the idea.

Some, like Chris Date and Hugh Darwen, believe strongly that SQL is fundamentally broken because it breaks Codd's relational theory in several ways. But then, when Chamberlin and Boyce originally wrote the paper defining SQL, their objective wasn't to create a relational language, but a user friendly one that managers could use. Obviously SQL quickly became too complicated for your average manager... But I'm wandering off the point here...

>> For SQL_DECIMAL PostgreSQL and MySQL seem to disagree on whether FLOAT
>> or DOUBLE should be used.
> Well, a postgres float with no precision is really the same as a
> double precision (as you hint at below). The whole thing is quite
> confusing when you start pulling in ODBC, SQL standard, IEEE, decimal
> vs. binary precision, etc. Have a good link to a doc that explains
> exactly what SQL_DECIMAL is meant to be?

I've got a copy of the standard, but I'm pretty sure I'd be breaking some law if I copied and pasted bits. DECIMAL is supposed to be an "exact numeric", whereas FLOAT, REAL and DOUBLE PRECISION are "approximate numerics".

So I guess DOUBLE is a better fit, as it's supposed to be more accurate. But neither is a good match really.
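The exact/approximate distinction is easy to demonstrate from Perl itself, using core Math::BigFloat as a stand-in for the decimal arithmetic DECIMAL/NUMERIC promise:

```perl
#!/usr/bin/perl
# Approximate vs exact numerics: native floats are binary approximations,
# while Math::BigFloat (core Perl) does decimal arithmetic, which is the
# behaviour DECIMAL/NUMERIC are supposed to give you.
use strict;
use warnings;
use Math::BigFloat;

# Approximate: 0.1 has no exact binary representation, so the sum drifts.
my $float_sum = 0;
$float_sum += 0.1 for 1 .. 10;
printf "float:   %.17g\n", $float_sum;   # not exactly 1

# Exact: decimal arithmetic sums ten 0.1s to exactly 1.
my $dec_sum = Math::BigFloat->new(0);
$dec_sum += Math::BigFloat->new('0.1') for 1 .. 10;
print "decimal: $dec_sum\n";
```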

>> PostgreSQL has SQL_DOUBLE associated with its INT8 (also called
>> LONGINT) instead of its FLOAT(25 - 53) or DOUBLE PRECISION, which give
>> double precision.
> Not sure about this one. You might want to take some of this up on
> dbdpg...@perl.org or even pgsql-gene...@postgresql.org.

Now that I've published the code for the comparison app and people can run their own comparisons, I think I will :)

>> It also has SQL_VARCHAR associated with TEXT instead of VARCHAR.
> Not sure about this one either - if there was a reason for that,
> I don't remember it offhand.
>
> Thanks for doing this work, it's good to see someone poking at
> this. +1 for getting the DBDs somewhat consistent, although it
> may be tricky as some (looking at you, MySQL!) play more fast and
> loose with their data type definitions than others. :)

Yes, the hard part is certainly trying to find consistency, or a way of making them act/emulate some consistency.


Lyle
