David Finlayson wrote:

OK, I can deal with the Nulls that in SQL...now, what do you mean that
"even numeric data" is imported as text?

I loaded about 85k records into a database, all numeric, and didn't
notice any problems using max() or stdev(). Should I be concerned
about something?

David,

Sqlite is a lot looser with data types than other databases. It actually assigns a type to each field value, rather than each column. Internally the .import command binds the data it reads from your file to a prepared insert statement. It always binds this data as text. Then it relies on the type conversion rules based on the column's type affinity to actually store the data.

The following shows what happens when some sample data is imported into tables with different column types (note the spaces after the commas in the last three rows).

1,2,3
3.14,1.414,2.718
5+6,7-8,9*9
1, 2, 3
3.14, 1.414, 2.718
5+6, 7-8, 9*9

   SQLite version 3.2.7
   Enter ".help" for instructions
   sqlite> create table t (a, b integer, c real);
   sqlite> .separator ,
   sqlite> .import test.csv t
   sqlite> select * from t;
   1,2,3
   3.14,1.414,2.718
   5+6,7-8,9*9
   1, 2, 3
   3.14, 1.414, 2.718
   5+6, 7-8, 9*9
   sqlite> select typeof(a), typeof(b), typeof(c) from t;
   text,integer,integer
   text,real,real
   text,text,text
   text,text,text
   text,text,text
   text,text,text
   sqlite> select sum(a), sum(b), sum(c) from t;
   18,10.414,14.718

As you can see, only a few cases actually store the data as numeric values, and even then it may store an integer in a real typed column or a real value in an integer typed column. Also many functions, like sum() above, can produce unexpected results.

The mismatches between the declared and actual types of values in a column can also cause problems with math operation such as integer results from integer division between columns that are declared as real (but actually hold integer data), or a real (non-integer) valued sum from a column that is declared integer.

Another problem you may encounter occurs when comparing text and numeric values. A value stored as text '1.50' in an untyped column will not compare equal to a real value 1.5 as shown in the example below (continued with the same table t as above).

   sqlite> delete from t;
   sqlite> insert into t values ('1.50', '2.50', '3.50');
   sqlite> select * from t;
   1.50,2.5,3.5
   sqlite> select typeof(a), typeof(b), typeof(c) from t;
   text,real,real
   sqlite> select a=1.5, b=2.5, c=3.5 from t;
   0,1,1

These issues aren't usually a big problem, but you need to keep them in the back of your mind. However, if you want numeric values in your database you must import into columns declared with some numeric type.

HTH
Dennis Cote

Reply via email to