On 10/26/08, Julian Bui <[EMAIL PROTECTED]> wrote:
> Puneet, I think I see what you're saying about the data types and their
>  affinities, but what does that have to do with the MUCH bigger table size
>  than what was expected?
>

That you were expecting smallint and bigint to behave the way they
should, but SQLite doesn't care about all that. If it sees the term
int in the description (actually, it is not clear from the statement
in the docs that "If the datatype contains the string 'INT' then it is
assigned INTEGER affinity" that smallint would be interpreted as
integer, but, for the moment let us assume so) it makes the column
integer. The size of the column is dependent on the value stored. So,
it can be as large as 8 bytes.

The char(8) also means nothing to SQLite. Soon as it sees "char" in
the description, it makes the column TEXT, and that is as long as it
takes. It is up to you to have your application croak if the length of
the text is longer than 8 characters.

So, per your original post, instead of 28 bytes per row (smallint +
bigint + smallint + double + char(8)), it would actually be (up to 8
bytes + up to 8 bytes + up to 8 bytes + 8 bytes + 8 bytes). That is 40
bytes. I think there is some internal overhead of about a byte or so
per column for SQLite.

Of course, that is still a lot less than your estimate of 2500+ bytes per row.

Are you sure you are storing your timestamp as Julianday (as a float)?
Else, it would be treated as TEXT.

Are you sure you are storing only 8 characters in the char(8) field?

How about dumping the entire database and even reloading it on another
machine to see what is going on?





>  Also, I made a mistake.  In an attempt to censor out my table name and
>  attribute names I forgot to fix everything.  So yes, I am making sure I'm
>  using one table and not two.
>
>  The code ought to be:
>
>
>  <code>
>  //ps is a prepared statement
>  ps = conn.prepareStatement("CREATE table IF NOT EXISTS MY_TABLE(aaa SMALLINT
>  NOT NULL, bbb BIGINT NOT NULL, ccc SMALLINT, ddd DOUBLE, eee CHAR(8));");
>  ps.execute();
>
>  ....
>
>
> ps = conn.prepareStatement("CREATE INDEX IF NOT EXISTS IDX_DDD on MYTABLE
>  (ddd)");
>  ps.execute();
>
>  ...
>
>  dataInsertPs = conn.prepareStatement("INSERT into MY_TABLE(aaa, bbb, ccc,
>  ddd, eee) VALUES (?, ?, ?, ?, ?
>
> //every dataInsertPs gets added to a batch and committed every 1000 records
>
>  </code>
>
>
> On 10/26/08, Julian Bui <julianbui at gmail.com
>
> <http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users>> wrote:
>  >* HI everyone,
>  *>*
>  *>*  I have records in my db that consist of smallint, bigint, smallint, 
> double,
>  *>*  char(8).  By my calculation that comes to 2 + 8 + 2 + 8 + 8 = 28 Bytes 
> per
>  *>*  record.  I also have an index over the attribute that is a double.
>  *
>  Why not start with reading TFM <http://www.sqlite.org/datatype3.html>
>
>  "2.1 Determination Of Column Affinity
>
>  The type affinity of a column is determined by the declared type of
>  the column, according to the following rules:
>
>  If the datatype contains the string "INT" then it is assigned INTEGER 
> affinity.
>
>  If the datatype of the column contains any of the strings "CHAR",
>  "CLOB", or "TEXT" then that column has TEXT affinity. Notice that the
>  type VARCHAR contains the string "CHAR" and is thus assigned TEXT
>  affinity.
>
>  If the datatype for a column contains the string "BLOB" or if no
>  datatype is specified then the column has affinity NONE.
>
>  If the datatype for a column contains any of the strings "REAL",
>  "FLOA", or "DOUB" then the column has REAL affinity
>
>  Otherwise, the affinity is NUMERIC."
>
>  What you expected was not what you got. That was expected.
>
>  >*
>  *>*  I inserted 100,000 records into a clean database and the database grew 
> to
>  *>*  240.0MB.  That comes out to 2519 Bytes per record (240 * 1024 * 1024 /
>  *>*  100,000).  How is that possible?  The index can't add THAT MUCH
>  extra space.
>  *>*
>  *>*  Results were duplicated when I started with a clean database and 
> inserted
>  *>*  other # of records.
>  *>*
>  *>*  This is very surprising.  I must be doing something wrong.
>  *>*
>  *>*  I'm attaching some code.  Also to note, I'm using the xerial JDBC 
> driver to
>  *>*  do this in java.
>  *>*
>  *>*  If anyone knows what may be wrong I'd love to hear feedback.
>  *>*
>  *>*  Thanks,
>  *>*  Julian
>  *>*
>  *>*  <code>
>  *>*  //ps is a prepared statement
>  *>*  ps = conn.prepareStatement("CREATE table IF NOT EXISTS
>  MY_TABLE(aaa SMALLINT
>  *>*  NOT NULL, bbb BIGINT NOT NULL, ccc SMALLINT, ddd DOUBLE, eee 
> CHAR(8));");
>  *>*  ps.execute();
>  *>*
>  *>*  ....
>  *>*
>  *>*  ps = conn.prepareStatement("CREATE INDEX IF NOT EXISTS IDX_TIME on
>  *>*  CAN_MESSAGES (timeStamp)");
>  *>*  ps.execute();
>  *>*
>  *>*  ...
>  *>*
>  *>*  dataInsertPs = conn.prepareStatement("INSERT into CAN_MESSAGES(lru, pgn,
>  *>*  bus, timeStamp, messageData) VALUES (?, ?, ?, ?, ?
>  *>*  //every dataInsertPs gets added to a batch and committed every 1000 
> records
>  *>*
>  *>*  </code>
>
> *>*  _______________________________________________
>  *>*  sqlite-users mailing list
>  *>*  sqlite-users at sqlite.org
>
> <http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users>
>
> *>*  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>  *>*
>
> *
>
>  --
>  Puneet Kishor http://punkish.eidesis.org/
>  Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
>  Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
>
> _______________________________________________
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
Puneet Kishor http://punkish.eidesis.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to