Ralf Junker wrote:
This can be fixed by checking the column affinity for a value when it is
stored. If an integer value is being stored in a column with numeric affinity,
then store the value as a REAL value rather than as an INTEGER value. This will
perform the same conversion that the other engines do, and hence produce the
same result when the division operation is performed later.
Unfortunately, this is not true for SQLite3. If you execute the following SQL
commands
CREATE TABLE t (a REAL, b REAL);
INSERT INTO t VALUES (5, 2);
both values are stored as INTEGER and not as REAL as one would expect by the
column affinity. In fact, this behaviour is intentional and is stated in the
help as a feature to minimize storage space for INTEGER numbers.
I'm sorry I took so long to get back to, I started this reply last
night, but had to leave it unfinished until now.
To clarify what I said above (since it obviously wasn't clear).
The first thing to note is that SQLite applies a data type to individual
values, and data type affinities to table columns. These are different
things.
I meant that your division problem can be fixed in SQLite if SQLite is
modified so that it checks the affinity of the column before storing the
integer value. In SQLite a column declared as REAL has an affinity of
NUMERIC (see http://www.sqlite.org/datatype3.html section 2.1). The
problem is that currently columns with NUMERIC affinity can store values
of any type, in particular it can store both INTEGER and REAL (see
section 1 of the data types documentation). The required change would be
to modify SQLite so that it would convert INTEGER values to REAL values
when they are stored in a column with NUMERIC affinity.
This would make your division work as expected. The integer value
supplied to the insert statement would be converted to real and stored
as real in the table. When the select statement later retrieves the
values to do the division, it will have real values and do floating
point path to produce a real result.
It has some side effects that others may not like however. Storing a
value of 5 into the table would return a value of 5.0 when selected,
since SQLite (at least the newest versions) formats REAL values with a
decimal point on output. It would also make the database files larger if
they were storing integer values in a typeless column or one with a type
that produced NUMERIC affinity (again see the data types doc for the
rules).
If a user really wants to force values to be stored and returned as
integers they would simply need to explicitly declare the columns to
have an INT type, so they would be assigned an INTEGER affinity. Then no
conversion would be done when the values are stored. To me this seems
like an optimization for both speed and space that the user can trigger
by explicitly specifying the column's data type.
To force storing numbers as REALs SQLite3 requires to use data binding or to
rewrite the 2nd command above like this:
INSERT INTO t VALUES (5.0, 2.0);
In other words, SQLite3 requires REALs to be explicitly written as such. Many users are certainly not aware of this requirement. They expect REALs to be stored as REALs just like other DB-engines do if a column is defined as such. However, SQLite3 behaves differently and this can
* result in wrong calculations / different than intended.
* lead to data errors when importing SQL scripts from other DBMS.
On the other INTEGER side, SQLite3 again behaves differently:
CREATE TABLE i (a INTEGER, b INTEGER);
INSERT INTO i VALUES (5, 2);
INSERT INTO i VALUES (5.0, 2.0);
Both INSERTs above store the numbers as INTEGERs, even those explicitly marked
as REALs by adding the '.0' decimal.
This behavior is also documented on the data type page. If the REAL
value can be represented exactly as an INTEGER, an INTEGER is stored in
columns declared to be of type INT. If not, the REAL value is stored.
This seems reasonable to me. It is in some ways the complement to the
change I proposed above.
If you do this in a standard SQL engine it will silently convert the 5.0
value to an exact value with implementation defined precision. Usually
not a problem for 5.0, but I suspect that 5.1 will probably come out of
the INTEGER column with the same value as 5.0 does. In SQLite, it will
store the 5.1 value as a REAL value even though it is in a column with
INTEGER affinity.
Another problem shows when calculations are not performed on colums with type
affinity but just on plain numbers like in
SELECT 5 / 2;
What exactly is intended? From the findings above, both numbers might be seen
as INTEGERs, but if '5.0' is interpreted as an INTEGER above, '5' might just as
well be a REAL.
Arguing from "common sense" I would prefer 5 / 2 to return 2.5 rather than 2.
If the result cannot be represented with exact numeric, I would rather want it to be
aproximate only. I can not overcome the feeling that exact numeric is sometimes simply
wrong, especially with the whole family of financial calculations.
This means what the user typed. Return 5 divided by 2. Each of the
literal values also has a type, in this case both are integer. The SQL
standard requires that division of two integer values should return an
exact result. Since integer is the only exact numeric type that SQLite
has, and the precision and scale of the resulting exact numeric type are
implementation defined in the standard, using an integer result (which
has a scale of 0 and precision of 9 for 32 bits) conforms to the standard.
I'm somewhat surprised that Oracle and mySQL both depart from the
standard here, and at the same time relieved to see that PostgreSQL
(which is generally regarded as the most standard conforming database
engine) does what the standard says. It is possible that the others are
using some other exact numeric result type to display the result, but I
suspect not.
It is unfortunate that the result value used for this discussion, 2.5,
has an exact representation in binary floating point. It makes it hard
to tell how the value is being represented. If we use 10/3 we get a
result that is not exactly representable in binary floating point (or
decimal for that matter), so it might be easier to see how the value is
represented. In SQLite we can use the typeof function to get the type of
the result.
select typeof(5/2);
integer
It would be interesting to see what type the other database engines
return if they have a similar facility.
In any case, I think that sqlite should continue to conform to the SQL
standard in this regard. I appreciate that many languages, like Python,
do this differently, but they are following a different standard.