[sqlite] static sqlite database
Eduardo, I need to do range searches based on id's so R-tree doesn't help, but your suggestion of attaching database sounds useful. I will also look at the size of the page and post my results. On Wed, Jun 10, 2015 at 7:19 PM Eduardo Morras wrote: > On Wed, 10 Jun 2015 14:13:29 + > Igor Stassiy wrote: > > > Thanks for all your replies. Please let me clarify: > > > > I need to do certain range operations efficiently and I prefer SQL > > functionality, plus SQLite gives a cross platform file storage > > format. The table having 3.2 megabytes is just an example. I am > > storing much more (orders of 10^5 of points) however the effect is > > similar. > > > > When I store several tables side by side the effect seems to be worse > > (I assume SQLite leaves the possibility for inserts and hence needs > > to have some space to do them efficiently, please correctly if I am > > wrong). SQLite forums suggest to store tables in separate files to > > avoid vacuum space in between the tables, are there any other methods > > to save space? > > Have you look at R-Tree extension? https://www.sqlite.org/rtree.html > > It creates a table and indexs to store and query latitude and longitude > efficiently. > > Or spatialite? https://en.wikipedia.org/wiki/SpatiaLite > http://www.gaia-gis.it/gaia-sins/ > > It's opensource under MPL. > > A page is dedicated only to one thing, table A, table B, internal sqlite3 > data, index C, whatever... A page can't store data from one table and data > from another table or index. This means that if your page size (default > 1024bytes) is not defined wise enough (f.ex. 64KB=65536byes) and your > tables uses a few bytes (say 128bytes), the rest of the page is empty > (65536-128=65408 bytes) for future table grows. There are page header and > tail with data you must count too, so it's a bit less. Note that you want > to store a big table with data, and the space lost on unwise page size is > minimal. > > You can have empty pages inside the database because you deleted data or > drop an index or a table, but in your static db, it shouldn't happen. In > any case, a vacuum and analyze solve the issue. > > I use the trick to store tables in separate db files and attach them to > main db for speed on mechanical hard disks, because those files on > different disks maximize the iops. An autovacumm and proper db maintenance > should minimize the vacuum space between tables. > > HTH > > --- --- > Eduardo Morras > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >
[sqlite] static sqlite database
On Wed, 10 Jun 2015 14:13:29 + Igor Stassiy wrote: > Thanks for all your replies. Please let me clarify: > > I need to do certain range operations efficiently and I prefer SQL > functionality, plus SQLite gives a cross platform file storage > format. The table having 3.2 megabytes is just an example. I am > storing much more (orders of 10^5 of points) however the effect is > similar. > > When I store several tables side by side the effect seems to be worse > (I assume SQLite leaves the possibility for inserts and hence needs > to have some space to do them efficiently, please correctly if I am > wrong). SQLite forums suggest to store tables in separate files to > avoid vacuum space in between the tables, are there any other methods > to save space? Have you look at R-Tree extension? https://www.sqlite.org/rtree.html It creates a table and indexs to store and query latitude and longitude efficiently. Or spatialite? https://en.wikipedia.org/wiki/SpatiaLite http://www.gaia-gis.it/gaia-sins/ It's opensource under MPL. A page is dedicated only to one thing, table A, table B, internal sqlite3 data, index C, whatever... A page can't store data from one table and data from another table or index. This means that if your page size (default 1024bytes) is not defined wise enough (f.ex. 64KB=65536byes) and your tables uses a few bytes (say 128bytes), the rest of the page is empty (65536-128=65408 bytes) for future table grows. There are page header and tail with data you must count too, so it's a bit less. Note that you want to store a big table with data, and the space lost on unwise page size is minimal. You can have empty pages inside the database because you deleted data or drop an index or a table, but in your static db, it shouldn't happen. In any case, a vacuum and analyze solve the issue. I use the trick to store tables in separate db files and attach them to main db for speed on mechanical hard disks, because those files on different disks maximize the iops. An autovacumm and proper db maintenance should minimize the vacuum space between tables. HTH --- --- Eduardo Morras
[sqlite] static sqlite database
On 2015-06-10 02:45 PM, Igor Stassiy wrote: > Hello, > > I have a question about controlling the size of sqlite database containing > records of the form id -> payload where payload has fixed size and id is an > integer column. > > I am developing a simple storage for points in an sqlite database, the > problem is that space consumption on disk is slightly unsatisfying. Here is > what I mean: > > A program to generate some data for the database: > > #include #include #include > using namespace std; > > int main() { >int n; >cin >> n; > >for(int i = 0; i < n; i++) { > double lat = 180*(rand()*1.0/RAND_MAX); > double lon = 180*(rand()*1.0/RAND_MAX); > int64_t id = rand()*1LL*rand(); > cout << id << "," << lat << "," << lon << "\n"; >} > >return 0;} > > To import the data run: > > g++ filldb.cpp > echo 10 | ./a.out > db.text > sqlite3 test.db > sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL); > sqlite> .separator , > sqlite> .mode csv > sqlite> .import db.text points > > The size of the database right after this shows 3.3M, however the "real" > size of data is 10*(8 bytes)*3 = 2.4M and after > > sqlite> create index points_index on points(id); > > the database size more than doubles to 5.1M > > Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is there > anything else to be done, in particular when the database is static (we > know there wont be any inserts)? A database structure has overheads and SQLite doesn't stick to only saving the bits that are needed (since this is never a requirement) - so you won't ever get the same space utilization as a flat-file might have. You can check exact space usage with the analyzer utility from the downloads page. Some things to note: Changing the id type to INTEGER PRIMARY KEY helps because that makes it an alias for the rowid which is the table's own index of rows and is always Integer. In other words, you remove the need for that column to contain its own set of data, which makes the space needed smaller. Secondly, you may also use a table without a rowid, which would look like this: create table points(id INT, lat REAL, lon REAL) WITHOUT ROWID; Notice also that SQLite has no strong types, it only has affinities, so there is no difference between UNSIGNED INT and INT. They all will be able to be represented as signed 64-bit integers, or indeed anything else you put in such a column (unless it is the rowid alias as described above, which can only hold integer values). More importantly than all the above, the Primary key is essentially an index on the table for id. There is no need to create another index on id. Creating an Index has to put all the data included in the index in a separate sorted B-Tree structure, which in turn will obviously eat up another large amount of space. If you are this concerned with space and the data won't change, why not just save the data in a flatfile? If you like the SQL way of doing things, you can read the file it into a SQLite TEMP table at runtime in Memory without any disk access, and work with it from there. 2MB or 5MB aren't exactly stellar and even the most basic embedded systems would have a memory capacity in excess of that. (And if it doesn't, then SQL isn't really an option to start with). > > Thank you > > Igor > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] static sqlite database
Thanks for all your replies. Please let me clarify: I need to do certain range operations efficiently and I prefer SQL functionality, plus SQLite gives a cross platform file storage format. The table having 3.2 megabytes is just an example. I am storing much more (orders of 10^5 of points) however the effect is similar. When I store several tables side by side the effect seems to be worse (I assume SQLite leaves the possibility for inserts and hence needs to have some space to do them efficiently, please correctly if I am wrong). SQLite forums suggest to store tables in separate files to avoid vacuum space in between the tables, are there any other methods to save space? On Wed, Jun 10, 2015 at 3:51 PM Hick Gunter wrote: > Apart from the fact that latitude and longitude have defined ranges +-90 > and +-180 respectively, why are you choosing SQLite as the storage format > for densely populated single table of constant if all you want > to do is read the values? > > If you are looking for a minimum space, uncompressed format you might as > well memory-map the C structure. > > typedef struct point_t { > double lat,lon; > }; > > typedef struct table_t { > uint64_t cnt; > point_t pnt[1]; > }; > > > > > -Urspr?ngliche Nachricht- > Von: Igor Stassiy [mailto:istassiy at gmail.com] > Gesendet: Mittwoch, 10. Juni 2015 14:45 > An: sqlite-users at mailinglists.sqlite.org > Cc: Kirill Afanasev > Betreff: [sqlite] static sqlite database > > Hello, > > I have a question about controlling the size of sqlite database > containing records of the form id -> payload where payload has fixed size > and id is an integer column. > > I am developing a simple storage for points in an sqlite database, > the problem is that space consumption on disk is slightly unsatisfying. > Here is what I mean: > > A program to generate some data for the database: > > #include #include #include using > namespace std; > > int main() { >int n; >cin >> n; > >for(int i = 0; i < n; i++) { > double lat = 180*(rand()*1.0/RAND_MAX); > double lon = 180*(rand()*1.0/RAND_MAX); > int64_t id = rand()*1LL*rand(); > cout << id << "," << lat << "," << lon << "\n"; >} > >return 0;} > > To import the data run: > > g++ filldb.cpp > echo 10 | ./a.out > db.text > sqlite3 test.db > sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL); > sqlite> .separator , .mode csv .import db.text points > > The size of the database right after this shows 3.3M, however the > "real" > size of data is 10*(8 bytes)*3 = 2.4M and after > > sqlite> create index points_index on points(id); > > the database size more than doubles to 5.1M > > Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is > there anything else to be done, in particular when the database is static > (we know there wont be any inserts)? > > Thank you > > Igor > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org sqlite-users at mailinglists.sqlite.org> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > Gunter Hick > Software Engineer > Scientific Games International GmbH > FN 157284 a, HG Wien > Klitschgasse 2-4, A-1130 Vienna, Austria > Tel: +43 1 80100 0 > E-Mail: hick at scigames.at > > This communication (including any attachments) is intended for the > use of the intended recipient(s) only and may contain information that is > confidential, privileged or legally protected. Any unauthorized use or > dissemination of this communication is strictly prohibited. If you have > received this communication in error, please immediately notify the sender > by return e-mail message and delete all copies of the original > communication. Thank you for your cooperation. > > > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >
[sqlite] static sqlite database
Apart from the fact that latitude and longitude have defined ranges +-90 and +-180 respectively, why are you choosing SQLite as the storage format for densely populated single table of constant if all you want to do is read the values? If you are looking for a minimum space, uncompressed format you might as well memory-map the C structure. typedef struct point_t { double lat,lon; }; typedef struct table_t { uint64_t cnt; point_t pnt[1]; }; -Urspr?ngliche Nachricht- Von: Igor Stassiy [mailto:istassiy at gmail.com] Gesendet: Mittwoch, 10. Juni 2015 14:45 An: sqlite-users at mailinglists.sqlite.org Cc: Kirill Afanasev Betreff: [sqlite] static sqlite database Hello, I have a question about controlling the size of sqlite database containing records of the form id -> payload where payload has fixed size and id is an integer column. I am developing a simple storage for points in an sqlite database, the problem is that space consumption on disk is slightly unsatisfying. Here is what I mean: A program to generate some data for the database: #include #include #include using namespace std; int main() { int n; cin >> n; for(int i = 0; i < n; i++) { double lat = 180*(rand()*1.0/RAND_MAX); double lon = 180*(rand()*1.0/RAND_MAX); int64_t id = rand()*1LL*rand(); cout << id << "," << lat << "," << lon << "\n"; } return 0;} To import the data run: g++ filldb.cpp echo 10 | ./a.out > db.text sqlite3 test.db sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL); sqlite> .separator , .mode csv .import db.text points The size of the database right after this shows 3.3M, however the "real" size of data is 10*(8 bytes)*3 = 2.4M and after sqlite> create index points_index on points(id); the database size more than doubles to 5.1M Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is there anything else to be done, in particular when the database is static (we know there wont be any inserts)? Thank you Igor ___ sqlite-users mailing list sqlite-users at mailinglists.sqlite.org<mailto:sqlite-users at mailinglists.sqlite.org> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: hick at scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.
[sqlite] static sqlite database
Hello, I have a question about controlling the size of sqlite database containing records of the form id -> payload where payload has fixed size and id is an integer column. I am developing a simple storage for points in an sqlite database, the problem is that space consumption on disk is slightly unsatisfying. Here is what I mean: A program to generate some data for the database: #include #include #include using namespace std; int main() { int n; cin >> n; for(int i = 0; i < n; i++) { double lat = 180*(rand()*1.0/RAND_MAX); double lon = 180*(rand()*1.0/RAND_MAX); int64_t id = rand()*1LL*rand(); cout << id << "," << lat << "," << lon << "\n"; } return 0;} To import the data run: g++ filldb.cpp echo 10 | ./a.out > db.text sqlite3 test.db sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL); sqlite> .separator , sqlite> .mode csv sqlite> .import db.text points The size of the database right after this shows 3.3M, however the "real" size of data is 10*(8 bytes)*3 = 2.4M and after sqlite> create index points_index on points(id); the database size more than doubles to 5.1M Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is there anything else to be done, in particular when the database is static (we know there wont be any inserts)? Thank you Igor
[sqlite] static sqlite database
On 6/10/15, Igor Stassiy wrote: > Hello, > > I have a question about controlling the size of sqlite database containing > records of the form id -> payload where payload has fixed size and id is an > integer column. > > I am developing a simple storage for points in an sqlite database, the > problem is that space consumption on disk is slightly unsatisfying. Have you use the sqlite3_analyzer.exe utility (available for download from https://www.sqlite.org/download.html) to see how the space in your database file is being used? -- D. Richard Hipp drh at sqlite.org