Ralf Junker wrote:

This can be fixed by checking the column affinity for a value when it is 
stored. If an integer value is being stored in a column with numeric affinity, 
then store the value as a REAL value rather than as an INTEGER value. This will 
perform the same conversion that the other engines do, and hence produce the 
same result when the division operation is performed later.

Unfortunately, this is not true for SQLite3. If you execute the following SQL 
commands

 CREATE TABLE t (a REAL, b REAL);
 INSERT INTO t VALUES (5, 2);

both values are stored as INTEGER and not as REAL as one would expect by the 
column affinity. In fact, this behaviour is intentional and is stated in the 
help as a feature to minimize storage space for INTEGER numbers.

I'm sorry I took so long to get back to, I started this reply last night, but had to leave it unfinished until now.

To clarify what I said above (since it obviously wasn't clear).

The first thing to note is that SQLite applies a data type to individual values, and data type affinities to table columns. These are different things.

I meant that your division problem can be fixed in SQLite if SQLite is modified so that it checks the affinity of the column before storing the integer value. In SQLite a column declared as REAL has an affinity of NUMERIC (see http://www.sqlite.org/datatype3.html section 2.1). The problem is that currently columns with NUMERIC affinity can store values of any type, in particular it can store both INTEGER and REAL (see section 1 of the data types documentation). The required change would be to modify SQLite so that it would convert INTEGER values to REAL values when they are stored in a column with NUMERIC affinity.

This would make your division work as expected. The integer value supplied to the insert statement would be converted to real and stored as real in the table. When the select statement later retrieves the values to do the division, it will have real values and do floating point path to produce a real result.

It has some side effects that others may not like however. Storing a value of 5 into the table would return a value of 5.0 when selected, since SQLite (at least the newest versions) formats REAL values with a decimal point on output. It would also make the database files larger if they were storing integer values in a typeless column or one with a type that produced NUMERIC affinity (again see the data types doc for the rules).

If a user really wants to force values to be stored and returned as integers they would simply need to explicitly declare the columns to have an INT type, so they would be assigned an INTEGER affinity. Then no conversion would be done when the values are stored. To me this seems like an optimization for both speed and space that the user can trigger by explicitly specifying the column's data type.

To force storing numbers as REALs SQLite3 requires to use data binding or to 
rewrite the 2nd command above like this:

 INSERT INTO t VALUES (5.0, 2.0);

In other words, SQLite3 requires REALs to be explicitly written as such. Many users are certainly not aware of this requirement. They expect REALs to be stored as REALs just like other DB-engines do if a column is defined as such. However, SQLite3 behaves differently and this can
* result in wrong calculations / different than intended.
* lead to data errors when importing SQL scripts from other DBMS.

On the other INTEGER side, SQLite3 again behaves differently:

 CREATE TABLE i (a INTEGER, b INTEGER);
 INSERT INTO i VALUES (5, 2);
 INSERT INTO i VALUES (5.0, 2.0);

Both INSERTs above store the numbers as INTEGERs, even those explicitly marked 
as REALs by adding the '.0' decimal.

This behavior is also documented on the data type page. If the REAL value can be represented exactly as an INTEGER, an INTEGER is stored in columns declared to be of type INT. If not, the REAL value is stored. This seems reasonable to me. It is in some ways the complement to the change I proposed above.

If you do this in a standard SQL engine it will silently convert the 5.0 value to an exact value with implementation defined precision. Usually not a problem for 5.0, but I suspect that 5.1 will probably come out of the INTEGER column with the same value as 5.0 does. In SQLite, it will store the 5.1 value as a REAL value even though it is in a column with INTEGER affinity.

Another problem shows when calculations are not performed on colums with type 
affinity but just on plain numbers like in

 SELECT 5 / 2;

What exactly is intended? From the findings above, both numbers might be seen 
as INTEGERs, but if '5.0' is interpreted as an INTEGER above, '5' might just as 
well be a REAL.

Arguing from "common sense" I would prefer 5 / 2 to return 2.5 rather than 2. 
If the result cannot be represented with exact numeric, I would rather want it to be 
aproximate only. I can not overcome the feeling that exact numeric is sometimes simply 
wrong, especially with the whole family of financial calculations.


This means what the user typed. Return 5 divided by 2. Each of the literal values also has a type, in this case both are integer. The SQL standard requires that division of two integer values should return an exact result. Since integer is the only exact numeric type that SQLite has, and the precision and scale of the resulting exact numeric type are implementation defined in the standard, using an integer result (which has a scale of 0 and precision of 9 for 32 bits) conforms to the standard.

I'm somewhat surprised that Oracle and mySQL both depart from the standard here, and at the same time relieved to see that PostgreSQL (which is generally regarded as the most standard conforming database engine) does what the standard says. It is possible that the others are using some other exact numeric result type to display the result, but I suspect not.

It is unfortunate that the result value used for this discussion, 2.5, has an exact representation in binary floating point. It makes it hard to tell how the value is being represented. If we use 10/3 we get a result that is not exactly representable in binary floating point (or decimal for that matter), so it might be easier to see how the value is represented. In SQLite we can use the typeof function to get the type of the result.

select typeof(5/2);
integer

It would be interesting to see what type the other database engines return if they have a similar facility.

In any case, I think that sqlite should continue to conform to the SQL standard in this regard. I appreciate that many languages, like Python, do this differently, but they are following a different standard.







Reply via email to