On 01/01/2013 03:12, Greg Sabino Mullane wrote:
> Lyle wrote:
>> I'm sure there are more things that look a bit odd or out of place. I'm
>> not sure whether there is good reasoning behind them, or whether the DBD
>> developers have been doing a best guess and we might possibly want to
>> consider making things more consistent?
> Certainly a best guess for most of the ones in Postgres aka DBD::Pg.
> Patches, corrections, suggestions, complaints certainly welcome.
> It's a fairly unloved bit of the code.

I see. I've put the comparison tool on github:
https://github.com/cosmicnet/CompareRDBMS
It's pretty lightweight: it uses CGIApp and HTTP::Server::Simple to create a standalone app you access through your browser. The default port is 8080, but you can change it when you start the server on the command line. At the moment you can only configure one database per driver for the comparison, and the GUI needs some more styling, but it's functional.
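For anyone who hasn't used HTTP::Server::Simple before, the pattern the app builds on looks roughly like this. This is a minimal sketch, not the actual CompareRDBMS code; the class name and page content here are invented for illustration:

```perl
#!/usr/bin/perl
# Minimal sketch of the HTTP::Server::Simple::CGI pattern.
# DemoServer and the page body are placeholders, not CompareRDBMS code.
use strict;
use warnings;

package DemoServer;
use base 'HTTP::Server::Simple::CGI';

# Called once per request with a CGI object.
sub handle_request {
    my ($self, $cgi) = @_;
    print "HTTP/1.0 200 OK\r\n";
    print $cgi->header('text/html'), "<h1>It works</h1>";
}

package main;

# Port defaults to 8080 but can be overridden on the command line,
# as with the comparison app.
my $port = shift @ARGV || 8080;
DemoServer->new($port)->run;
```

Point a browser at http://localhost:8080/ once it's running.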

>> PostgreSQL BYTEA that's currently SQL_VARBINARY would seem a better
>> SQL_LONGVARBINARY...
> Yeah, one has to think more in terms of the opposite mapping - if someone
> asks for SQL_VARBINARY, what do we give them? Technically, BYTEA maps
> to both, as Postgres has no (explicit) length limitation. When in doubt,
> we use the more "standard" and/or "common" one. In this case, no LONG.

In that case, maybe the MySQL driver's idea of having multiple mappings for the same type is the right way. If someone is asking for SQL_LONGVARBINARY then BYTEA is certainly capable.
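This mapping is what DBI exposes through type_info() and type_info_all(); a driver that advertises several rows for one SQL_* code is doing exactly what DBD::mysql does. A rough sketch of the lookup (the DSN is a placeholder, and you'd need a reachable database for this to run):

```perl
#!/usr/bin/perl
# Sketch: ask a driver which native type it advertises for a DBI/ODBC
# type code. The DSN below is a placeholder; adjust for your own setup.
use strict;
use warnings;
use DBI qw(:sql_types);

my $dbh = DBI->connect('dbi:Pg:dbname=test', '', '', { RaiseError => 1 });

# type_info() returns the driver's best match for the requested code;
# type_info_all() returns every row, so a driver can legitimately list
# BYTEA under both SQL_VARBINARY and SQL_LONGVARBINARY.
my $info = $dbh->type_info(SQL_LONGVARBINARY);
print defined $info
    ? "SQL_LONGVARBINARY -> $info->{TYPE_NAME}\n"
    : "no SQL_LONGVARBINARY mapping advertised\n";

$dbh->disconnect;
```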

>> Similar situation for PostgreSQL character.
> Yep.
>> Reviewing the PostgreSQL documentation on CHARACTER it mentions things
>> like short, long and very long character strings, but lacks detail so
>> I've emailed them about it.
> Those are internal details about how they are stored and won't affect
> anything as far as an application is concerned.

I thought it might be useful to know that strings below a certain length are stored uncompressed and so are a little faster. Likewise, very long strings use a different storage mechanism that one might want to avoid. Although I've only just had a reply to my post asking for specifics, and haven't had a chance to look into it further.

>> MySQL's FLOAT and DOUBLE are linked to several ODBC types, perhaps
>> PostgreSQL could do the same? Or is that bad practice on the
>> MySQL driver's part?
> Hard to say, can you be more specific? Keeping in mind that I don't
> actually know a whole lot about the SQL/ODBC specs, and the differences
> therein. :)

From what I understand, the ISO working group was formed in an attempt to standardise the variety of SQL dialects that were appearing. The various players, such as Oracle, IBM, etc., had people in the working group trying to come to an agreement and form the specs. Of course, they each had their own objectives. For many of the features, they already had their own syntax. Various committee votes decided the official syntax, which could be from one source, a mixture, or something new. Then the implementations might adopt the standard way of doing it, or decide that actually their way was better. If it was a feature they had already implemented, they would have legacy syntax to deprecate (or keep) if they adopted the new standard syntax.

The standard itself doesn't define number ranges for things like INT; those are defined by implementations. So SMALLINT, INT and BIGINT can happily all be the same. SMALLINT and BIGINT were brought in, like other type variations, because implementations wanted to add new integers with different ranges. So the standard agreed on new names, but in the standard's eyes they are practically synonyms.
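As a concrete example of those implementation-defined ranges, PostgreSQL documents SMALLINT/INTEGER/BIGINT as 2-, 4- and 8-byte types, a choice the standard leaves entirely to the implementation. A quick Perl illustration:

```perl
#!/usr/bin/perl
# PostgreSQL's documented integer ranges. The SQL standard only requires
# that SMALLINT's precision <= INTEGER's <= BIGINT's; an implementation
# could legally make all three identical.
use strict;
use warnings;

my %range = (
    SMALLINT => [ '-32768',               '32767' ],                # 2 bytes
    INTEGER  => [ '-2147483648',          '2147483647' ],           # 4 bytes
    BIGINT   => [ '-9223372036854775808', '9223372036854775807' ],  # 8 bytes
);
for my $type (qw(SMALLINT INTEGER BIGINT)) {
    printf "%-8s %s .. %s\n", $type, @{ $range{$type} };
}
```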

Obviously I'm simplifying things a lot, but you get the idea.

Some, like Chris Date and Hugh Darwen, believe strongly that SQL is fundamentally broken because it breaks Codd's relational theory in several ways. But then, when Chamberlin and Boyce originally wrote the paper defining SQL, their objective wasn't to create a relational language, but a user friendly one that managers could use. Obviously SQL quickly became too complicated for your average manager... But I'm wandering off the point here...

>> For SQL_DECIMAL PostgreSQL and MySQL seem to disagree on whether FLOAT
>> or DOUBLE should be used.
> Well, a postgres float with no precision is really the same as a
> double precision (as you hint at below). The whole thing is quite
> confusing when you start pulling in ODBC, SQL standard, IEEE, decimal
> vs. binary precision, etc. Have a good link to a doc that explains
> exactly what SQL_DECIMAL is meant to be?

I've got a copy of the standard, but I'm pretty sure I'd be breaking some law if I copied and pasted bits. DECIMAL is supposed to be an "exact numeric", whereas FLOAT, REAL and DOUBLE PRECISION are "approximate numerics".

So I guess DOUBLE is a better fit, as it's supposed to be more accurate. But neither is a good match really.
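The exact/approximate distinction is easy to demonstrate from Perl itself, using core Math::BigFloat as a stand-in for the decimal arithmetic DECIMAL/NUMERIC promise:

```perl
#!/usr/bin/perl
# Approximate vs exact numerics: native floats are binary approximations,
# while Math::BigFloat (core Perl) does decimal arithmetic, which is the
# behaviour DECIMAL/NUMERIC are supposed to give you.
use strict;
use warnings;
use Math::BigFloat;

# Approximate: 0.1 has no exact binary representation, so the sum drifts.
my $float_sum = 0;
$float_sum += 0.1 for 1 .. 10;
printf "float:   %.17g\n", $float_sum;   # not exactly 1

# Exact: decimal arithmetic sums ten 0.1s to exactly 1.
my $dec_sum = Math::BigFloat->new(0);
$dec_sum += Math::BigFloat->new('0.1') for 1 .. 10;
print "decimal: $dec_sum\n";
```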

>> PostgreSQL has SQL_DOUBLE associated with its INT8 (also called
>> LONGINT) instead of its FLOAT(25 - 53) or DOUBLE PRECISION, which give
>> double precision.
> Not sure about this one. You might want to take some of this up on
> dbdpg...@perl.org or even pgsql-gene...@postgresql.org.

Now that I've published the code for the comparison app and people can run their own comparisons, I think I will :)

>> It also has SQL_VARCHAR associated with TEXT instead of VARCHAR.
> Not sure about this one either - if there was a reason for that,
> I don't remember it offhand.
>
> Thanks for doing this work, it's good to see someone poking at
> this. +1 for getting the DBDs somewhat consistent, although it
> may be tricky as some (looking at you, MySQL!) play more fast and
> loose with their data type definitions than others. :)

Yes, the hard part is certainly trying to find consistency, or a way of making them act/emulate some consistency.


Lyle
