[sqlite] static sqlite database

2015-06-11 Thread Igor Stassiy
Eduardo, I need to do range searches based on id's so R-tree doesn't help,
but your suggestion of attaching database sounds useful. I will also look
at the size of the page and post my results.

On Wed, Jun 10, 2015 at 7:19 PM Eduardo Morras  wrote:

> On Wed, 10 Jun 2015 14:13:29 +
> Igor Stassiy  wrote:
>
> > Thanks for all your replies. Please let me clarify:
> >
> > I need to do certain range operations efficiently and I prefer SQL
> > functionality, plus SQLite gives a cross platform file storage
> > format. The table having 3.2 megabytes is just an example. I am
> > storing much more (orders of 10^5 of points) however the effect is
> > similar.
> >
> > When I store several tables side by side the effect seems to be worse
> > (I assume SQLite leaves the possibility for inserts and hence needs
> > to have some space to do them efficiently, please correctly if I am
> > wrong). SQLite forums suggest to store tables in separate files to
> > avoid vacuum space in between the tables, are there any other methods
> > to save space?
>
> Have you look at R-Tree extension? https://www.sqlite.org/rtree.html
>
> It creates a table and indexs to store and query latitude and longitude
> efficiently.
>
> Or spatialite? https://en.wikipedia.org/wiki/SpatiaLite
> http://www.gaia-gis.it/gaia-sins/
>
> It's opensource under MPL.
>
> A page is dedicated only to one thing, table A, table B, internal sqlite3
> data, index C, whatever... A page can't store data from one table and data
> from another table or index. This means that if your page size (default
> 1024bytes) is not defined wise enough (f.ex. 64KB=65536byes) and your
> tables uses a few bytes (say 128bytes), the rest of the page is empty
> (65536-128=65408 bytes) for future table grows. There are page header and
> tail with data you must count too, so it's a bit less. Note that you want
> to store a big table with data, and the space lost on unwise page size is
> minimal.
>
> You can have empty pages inside the database because you deleted data or
> drop an index or a table, but in your static db, it shouldn't happen. In
> any case, a vacuum and analyze solve the issue.
>
> I use the trick to store tables in separate db files and attach them to
> main db for speed on mechanical hard disks, because those files on
> different disks maximize the iops. An autovacumm and proper db maintenance
> should minimize the vacuum space between tables.
>
> HTH
>
> ---   ---
> Eduardo Morras 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] static sqlite database

2015-06-10 Thread Eduardo Morras
On Wed, 10 Jun 2015 14:13:29 +
Igor Stassiy  wrote:

> Thanks for all your replies. Please let me clarify:
> 
> I need to do certain range operations efficiently and I prefer SQL
> functionality, plus SQLite gives a cross platform file storage
> format. The table having 3.2 megabytes is just an example. I am
> storing much more (orders of 10^5 of points) however the effect is
> similar.
> 
> When I store several tables side by side the effect seems to be worse
> (I assume SQLite leaves the possibility for inserts and hence needs
> to have some space to do them efficiently, please correctly if I am
> wrong). SQLite forums suggest to store tables in separate files to
> avoid vacuum space in between the tables, are there any other methods
> to save space?

Have you look at R-Tree extension? https://www.sqlite.org/rtree.html

It creates a table and indexs to store and query latitude and longitude 
efficiently. 

Or spatialite? https://en.wikipedia.org/wiki/SpatiaLite 
http://www.gaia-gis.it/gaia-sins/

It's opensource under MPL.

A page is dedicated only to one thing, table A, table B, internal sqlite3 data, 
index C, whatever... A page can't store data from one table and data from 
another table or index. This means that if your page size (default 1024bytes) 
is not defined wise enough (f.ex. 64KB=65536byes) and your tables uses a few 
bytes (say 128bytes), the rest of the page is empty (65536-128=65408 bytes) for 
future table grows. There are page header and tail with data you must count 
too, so it's a bit less. Note that you want to store a big table with data, and 
the space lost on unwise page size is minimal. 

You can have empty pages inside the database because you deleted data or drop 
an index or a table, but in your static db, it shouldn't happen. In any case, a 
vacuum and analyze solve the issue.

I use the trick to store tables in separate db files and attach them to main db 
for speed on mechanical hard disks, because those files on different disks 
maximize the iops. An autovacumm and proper db maintenance should minimize the 
vacuum space between tables.

HTH

---   ---
Eduardo Morras 


[sqlite] static sqlite database

2015-06-10 Thread R.Smith


On 2015-06-10 02:45 PM, Igor Stassiy wrote:
> Hello,
>
> I have a question about controlling the size of sqlite database containing
> records of the form id -> payload where payload has fixed size and id is an
> integer column.
>
> I am developing a simple storage for points in an sqlite database, the
> problem is that space consumption on disk is slightly unsatisfying. Here is
> what I mean:
>
> A program to generate some data for the database:
>
> #include #include #include 
> using namespace std;
>
> int main() {
>int n;
>cin >> n;
>
>for(int i = 0; i < n; i++) {
>  double lat = 180*(rand()*1.0/RAND_MAX);
>  double lon = 180*(rand()*1.0/RAND_MAX);
>  int64_t id = rand()*1LL*rand();
>  cout << id << "," << lat << "," << lon << "\n";
>}
>
>return 0;}
>
> To import the data run:
>
> g++ filldb.cpp
> echo 10 | ./a.out > db.text
> sqlite3 test.db
> sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL);
> sqlite> .separator ,
> sqlite> .mode csv
> sqlite> .import db.text points
>
> The size of the database right after this shows 3.3M, however the "real"
> size of data is 10*(8 bytes)*3 = 2.4M and after
>
> sqlite> create index points_index on points(id);
>
> the database size more than doubles to 5.1M
>
> Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is there
> anything else to be done, in particular when the database is static (we
> know there wont be any inserts)?

A database structure has overheads and SQLite doesn't stick to only 
saving the bits that are needed (since this is never a requirement) - so 
you won't ever get the same space utilization as a flat-file might have. 
You can check exact space usage with the analyzer utility from the 
downloads page.

Some things to note: Changing the id type to INTEGER PRIMARY KEY helps 
because that makes it an alias for the rowid which is the table's own 
index of rows and is always Integer.  In other words, you remove the 
need for that column to contain its own set of data, which makes the 
space needed smaller.

Secondly, you may also use a table without a rowid, which would look 
like this:
create table points(id INT, lat REAL, lon REAL) WITHOUT ROWID;

Notice also that SQLite has no strong types, it only has affinities, so 
there is no difference between UNSIGNED INT and INT. They all will be 
able to be represented as signed 64-bit integers, or indeed anything 
else you put in such a column (unless it is the rowid alias as described 
above, which can only hold integer values).

More importantly than all the above, the Primary key is essentially an 
index on the table for id. There is no need to create another index on 
id. Creating an Index has to put all the data included in the index in a 
separate sorted B-Tree structure, which in turn will obviously eat up 
another large amount of space.

If you are this concerned with space and the data won't change, why not 
just save the data in a flatfile?
If you like the SQL way of doing things, you can read the file it into a 
SQLite TEMP table at runtime in Memory without any disk access, and work 
with it from there. 2MB or 5MB aren't exactly stellar and even the most 
basic embedded systems would have a memory capacity in excess of that. 
(And if it doesn't, then SQL isn't really an option to start with).



>
> Thank you
>
> Igor
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



[sqlite] static sqlite database

2015-06-10 Thread Igor Stassiy
Thanks for all your replies. Please let me clarify:

I need to do certain range operations efficiently and I prefer SQL
functionality, plus SQLite gives a cross platform file storage format. The
table having 3.2 megabytes is just an example. I am storing much more
(orders of 10^5 of points) however the effect is similar.

When I store several tables side by side the effect seems to be worse (I
assume SQLite leaves the possibility for inserts and hence needs to have
some space to do them efficiently, please correctly if I am wrong). SQLite
forums suggest to store tables in separate files to avoid vacuum space in
between the tables, are there any other methods to save space?

On Wed, Jun 10, 2015 at 3:51 PM Hick Gunter  wrote:

> Apart from the fact that latitude and longitude have defined ranges +-90
> and +-180 respectively, why are you choosing SQLite as the storage format
> for densely populated single table of constant  if all you want
> to do is read the values?
>
> If you are looking for a minimum space, uncompressed format you might as
> well memory-map the C structure.
>
> typedef struct point_t {
> double lat,lon;
>  };
>
> typedef struct table_t {
> uint64_t cnt;
> point_t  pnt[1];
> };
>
>
>
>
>  -Urspr?ngliche Nachricht-
>  Von: Igor Stassiy [mailto:istassiy at gmail.com]
>  Gesendet: Mittwoch, 10. Juni 2015 14:45
>  An: sqlite-users at mailinglists.sqlite.org
>  Cc: Kirill Afanasev
>  Betreff: [sqlite] static sqlite database
>
>  Hello,
>
>  I have a question about controlling the size of sqlite database
> containing records of the form id -> payload where payload has fixed size
> and id is an integer column.
>
>  I am developing a simple storage for points in an sqlite database,
> the problem is that space consumption on disk is slightly unsatisfying.
> Here is what I mean:
>
>  A program to generate some data for the database:
>
>  #include #include #include  using
> namespace std;
>
>  int main() {
>int n;
>cin >> n;
>
>for(int i = 0; i < n; i++) {
>  double lat = 180*(rand()*1.0/RAND_MAX);
>  double lon = 180*(rand()*1.0/RAND_MAX);
>  int64_t id = rand()*1LL*rand();
>  cout << id << "," << lat << "," << lon << "\n";
>}
>
>return 0;}
>
>  To import the data run:
>
>  g++ filldb.cpp
>  echo 10 | ./a.out > db.text
>  sqlite3 test.db
>  sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL);
>  sqlite> .separator , .mode csv .import db.text points
>
>  The size of the database right after this shows 3.3M, however the
> "real"
>  size of data is 10*(8 bytes)*3 = 2.4M and after
>
>  sqlite> create index points_index on points(id);
>
>  the database size more than doubles to 5.1M
>
>  Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is
> there anything else to be done, in particular when the database is static
> (we know there wont be any inserts)?
>
>  Thank you
>
>  Igor
>  ___
>  sqlite-users mailing list
>  sqlite-users at mailinglists.sqlite.org sqlite-users at mailinglists.sqlite.org>
>  http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>  ___
>   Gunter Hick
>  Software Engineer
>  Scientific Games International GmbH
>  FN 157284 a, HG Wien
>  Klitschgasse 2-4, A-1130 Vienna, Austria
>  Tel: +43 1 80100 0
>  E-Mail: hick at scigames.at
>
>  This communication (including any attachments) is intended for the
> use of the intended recipient(s) only and may contain information that is
> confidential, privileged or legally protected. Any unauthorized use or
> dissemination of this communication is strictly prohibited. If you have
> received this communication in error, please immediately notify the sender
> by return e-mail message and delete all copies of the original
> communication. Thank you for your cooperation.
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] static sqlite database

2015-06-10 Thread Hick Gunter
Apart from the fact that latitude and longitude have defined ranges +-90 and 
+-180 respectively, why are you choosing SQLite as the storage format for 
densely populated single table of constant  if all you want to do is 
read the values?

If you are looking for a minimum space, uncompressed format you might as well 
memory-map the C structure.

typedef struct point_t {
double lat,lon;
 };

typedef struct table_t {
uint64_t cnt;
point_t  pnt[1];
};




 -Urspr?ngliche Nachricht-
 Von: Igor Stassiy [mailto:istassiy at gmail.com]
 Gesendet: Mittwoch, 10. Juni 2015 14:45
 An: sqlite-users at mailinglists.sqlite.org
 Cc: Kirill Afanasev
 Betreff: [sqlite] static sqlite database

 Hello,

 I have a question about controlling the size of sqlite database containing 
records of the form id -> payload where payload has fixed size and id is an 
integer column.

 I am developing a simple storage for points in an sqlite database, the 
problem is that space consumption on disk is slightly unsatisfying. Here is 
what I mean:

 A program to generate some data for the database:

 #include #include #include  using namespace 
std;

 int main() {
   int n;
   cin >> n;

   for(int i = 0; i < n; i++) {
 double lat = 180*(rand()*1.0/RAND_MAX);
 double lon = 180*(rand()*1.0/RAND_MAX);
 int64_t id = rand()*1LL*rand();
 cout << id << "," << lat << "," << lon << "\n";
   }

   return 0;}

 To import the data run:

 g++ filldb.cpp
 echo 10 | ./a.out > db.text
 sqlite3 test.db
 sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL);
 sqlite> .separator , .mode csv .import db.text points

 The size of the database right after this shows 3.3M, however the "real"
 size of data is 10*(8 bytes)*3 = 2.4M and after

 sqlite> create index points_index on points(id);

 the database size more than doubles to 5.1M

 Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is there 
anything else to be done, in particular when the database is static (we know 
there wont be any inserts)?

 Thank you

 Igor
 ___
 sqlite-users mailing list
 sqlite-users at mailinglists.sqlite.org<mailto:sqlite-users at 
mailinglists.sqlite.org>
 http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


 ___
  Gunter Hick
 Software Engineer
 Scientific Games International GmbH
 FN 157284 a, HG Wien
 Klitschgasse 2-4, A-1130 Vienna, Austria
 Tel: +43 1 80100 0
 E-Mail: hick at scigames.at

 This communication (including any attachments) is intended for the use of 
the intended recipient(s) only and may contain information that is 
confidential, privileged or legally protected. Any unauthorized use or 
dissemination of this communication is strictly prohibited. If you have 
received this communication in error, please immediately notify the sender by 
return e-mail message and delete all copies of the original communication. 
Thank you for your cooperation.





[sqlite] static sqlite database

2015-06-10 Thread Igor Stassiy
Hello,

I have a question about controlling the size of sqlite database containing
records of the form id -> payload where payload has fixed size and id is an
integer column.

I am developing a simple storage for points in an sqlite database, the
problem is that space consumption on disk is slightly unsatisfying. Here is
what I mean:

A program to generate some data for the database:

#include #include #include 
using namespace std;

int main() {
  int n;
  cin >> n;

  for(int i = 0; i < n; i++) {
double lat = 180*(rand()*1.0/RAND_MAX);
double lon = 180*(rand()*1.0/RAND_MAX);
int64_t id = rand()*1LL*rand();
cout << id << "," << lat << "," << lon << "\n";
  }

  return 0;}

To import the data run:

g++ filldb.cpp
echo 10 | ./a.out > db.text
sqlite3 test.db
sqlite> create table points(id UNSIGNED INT, lat REAL, lon REAL);
sqlite> .separator ,
sqlite> .mode csv
sqlite> .import db.text points

The size of the database right after this shows 3.3M, however the "real"
size of data is 10*(8 bytes)*3 = 2.4M and after

sqlite> create index points_index on points(id);

the database size more than doubles to 5.1M

Changing id UNSIGNED INT to INTEGER PRIMARY KEY does help, but is there
anything else to be done, in particular when the database is static (we
know there wont be any inserts)?

Thank you

Igor


[sqlite] static sqlite database

2015-06-10 Thread Richard Hipp
On 6/10/15, Igor Stassiy  wrote:
> Hello,
>
> I have a question about controlling the size of sqlite database containing
> records of the form id -> payload where payload has fixed size and id is an
> integer column.
>
> I am developing a simple storage for points in an sqlite database, the
> problem is that space consumption on disk is slightly unsatisfying.

Have you use the sqlite3_analyzer.exe utility (available for download
from https://www.sqlite.org/download.html) to see how the space in
your database file is being used?


-- 
D. Richard Hipp
drh at sqlite.org