Re: [sqlite] Database size bigger than before deleting records

2012-12-03 Thread Henry Huang
Hi Jay,

No, auto_vacuuming is off. And yes, the rows are gone, at least i couldn't
find them anymore in those tables.

My database file is more than 50MB, and I only deleted records in 3 tables.
And indeed, there are lots of indexes inside, so maybe you're right.



On Sun, Dec 2, 2012 at 10:37 PM, Jay A. Kreibich  wrote:

> On Sun, Dec 02, 2012 at 09:39:23PM +0100, Henry Huang scratched on the
> wall:
> > Good day everyone,
> >
> > I had a database file, and I deleted many records (tens of thousands)
> from
> > three tables, then, I did a VACUUM to that database. After I checked the
> > size of the database file, I was a bit surprised that it's even (although
> > slightly) bigger than before deletions.
> >
> > Is that possible?
>
>   It is not all that unusual for a VACUUM to result in a slightly
>   larger database, especially with a DB that has a fair number of
>   indexes.  This is usually not the case when rows have been deleted
>   from the database, however.
>
>   Not to ask the obvious, but are you sure the rows were actually
>   deleted?  Was auto-vacuuming on?
>
>-j
>
> --
> Jay A. Kreibich < J A Y  @  K R E I B I.C H >
>
> "Intelligence is like underwear: it is important that you have it,
>  but showing it to the wrong people has the tendency to make them
>  feel uncomfortable." -- Angela Johnson
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size bigger than before deleting records

2012-12-02 Thread Jay A. Kreibich
On Sun, Dec 02, 2012 at 09:39:23PM +0100, Henry Huang scratched on the wall:
> Good day everyone,
> 
> I had a database file, and I deleted many records (tens of thousands) from
> three tables, then, I did a VACUUM to that database. After I checked the
> size of the database file, I was a bit surprised that it's even (although
> slightly) bigger than before deletions.
> 
> Is that possible?

  It is not all that unusual for a VACUUM to result in a slightly
  larger database, especially with a DB that has a fair number of
  indexes.  This is usually not the case when rows have been deleted
  from the database, however.

  Not to ask the obvious, but are you sure the rows were actually
  deleted?  Was auto-vacuuming on?

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size

2012-09-02 Thread Simon Slavin

On 2 Sep 2012, at 8:15am, Keith Medcalf  wrote:

> On Saturday, 01 September, 2012, at 20:28 Ted Rolle, Jr. wrote:
> 
>> Remember Y2K?  That was caused by a three-letter blue company.  They
>> wanted to save 1 (one!) byte by not storing the century in critical
>> operating system fields.  The comments were (1960s) "Well, we won't be
>> around to fix it...wink, wink, nudge, nudge."  I was.  Most companies
>> got through it with few problems --- a tribute to the programming staff.

Only slightly true.  IBM was supplying hardware but (Global Services) wasn't so 
big at the time, a lot of it was Arthur Andersen and EDS and they were both 
terrible.  Also it's worth remembering that after all the fuss absolutely 
nothing terrible happened when y2k rolled around.  Everyone had been 
appropriately alert and diligent and almost all problems had been taken care of 
in time.  A couple of things slipped through, none of which (as far as I know) 
caused a single death.

> Insistence on using two-digit years survived up to the early 90's, and the 
> Operating System wasn't the problem really the problem.  The real problem was 
> applications and data being and manipulated and stored "century-free", by 
> people who ought to have known better.  This still persists today with people 
> who insist in recording and displaying dates in formats such as 06/07/02.  
> Does this mean 06 July xx02; June 7, xx02; or 2 July xx06 (with 
> guess-the-century).

Take it from someone who was there at the time.  By 1990 companies were 
cynically leaving out century numbers in the hope that they'd be able to charge 
somebody some money somehow when it became necessary to make their code y2k 
safe.  I raised a stink when a terrible well-known programming company left out 
centuries in a job they were starting in 1989 for an international bank.  They 
backed down only when I reminded them my side had final wording on the contract 
and would write in whatever penalty clauses I thought appropriate.

I did see a company put century numbers in.  For a customer with whom they had 
a twenty year contract.

At the time we had 40 Meg disks on mundane home PCs, and 250 Meg disks on 
servers for small businesses.  You could put together multi-gigabyte disk 
arrays for two grand a gig.  There really wasn't any reason to save two bytes 
per date.

> The next most disgusting behaviour is failure (or rather refusal) to display 
> timezones. Extremely annoying are those folks who store datetime in localized 
> time rather than storing in UT1 and converting on input/output.

Annoying now.  Back then, converting to local time took a lot of CPU time.  
Doing the conversions necessary for showing a screenfull (probably 20 lines) of 
transactions with one date each involved a significant slowdown.  In fact many 
formats for data storage were governed not by what format was fastest for 
maths, but by the format wanted for displaying data on the screen.  All 
significant data processing was done in an overnight batch run anyway, so 
processing speed for maths wasn't as important.  I think almost all the systems 
I was aware of stored dates as MMDDHHMMSS in the local time of the client's 
HQ.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size

2012-09-02 Thread Keith Medcalf
On Saturday, 01 September, 2012, at 20:28 Ted Rolle, Jr. wrote:

> This is so true!

> Remember Y2K?  That was caused by a three-letter blue company.  They
> wanted to save 1 (one!) byte by not storing the century in critical
> operating system fields.  The comments were (1960s) "Well, we won't be
> around to fix it...wink, wink, nudge, nudge."  I was.  Most companies
> got through it with few problems --- a tribute to the programming staff.

Insistence on using two-digit years survived up to the early 90's, and the 
Operating System wasn't the problem really the problem.  The real problem was 
applications and data being and manipulated and stored "century-free", by 
people who ought to have known better.  This still persists today with people 
who insist in recording and displaying dates in formats such as 06/07/02.  Does 
this mean 06 July xx02; June 7, xx02; or 2 July xx06 (with guess-the-century).

The next most disgusting behaviour is failure (or rather refusal) to display 
timezones.  Extremely annoying are those folks who store datetime in localized 
time rather than storing in UT1 and converting on input/output.

 
> On 09/01/2012 07:51 PM, Simon Slavin wrote:

>> On 1 Sep 2012, at 11:34pm, "Ted Rolle, Jr."  wrote:

>>> Back in the olden days we predicted a database's storage to be about 5
>>> times the size of the data.
>>> By 'olden' I mean IBM's IMS, VSAM, DB2. ..., 70s, 80s.

>> Back in the old days you had 72 or 80 columns to a punched card, and any
>> columns you didn't use were wasted.  You could double your database
>> capacity/speed/cost by saving one bit per record.

---
()  ascii ribbon campaign against html e-mail
/\  www.asciiribbon.org




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size

2012-09-01 Thread Ted Rolle, Jr.
This is so good to hear!  The advances in database theory and practice
have put the old ideas to rest.

Hooray for today!

Ted

On 09/01/2012 07:08 PM, Richard Hipp wrote:
> On Sat, Sep 1, 2012 at 6:34 PM, Ted Rolle, Jr.  wrote:
>
>> Back in the olden days we predicted a database's storage to be about 5
>> times the size of the data.
>> By 'olden' I mean IBM's IMS, VSAM, DB2. ..., 70s, 80s.
>> I hope this is still not the case...
>>
> A lot depends on your data, of course.
>
> But the Fossil  repository (an SQLite database)
> that holds the complete 12.5 year revision history of SQLite is about 69.4%
> efficient at holding data overall (meaning that the content held is about
> 69.4% of the total database size, and about 82.8% efficient if you exclude
> indices.  That is a lot better than your 20% rule-of-thumb.  On the other
> hand, you can make the overall storage efficiency as small as you want by
> creating enough useless indices...
>
> You can measure the storage efficiency of your on SQLite databases using
> sqlite3_analyzer.exe binary available on the download
> page
> .
>
>
>> Ted
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size

2012-09-01 Thread Ted Rolle, Jr.
This is so true!

Remember Y2K?  That was caused by a three-letter blue company.  They
wanted to save 1 (one!) byte by not storing the century in critical
operating system fields.  The comments were (1960s) "Well, we won't be
around to fix it...wink, wink, nudge, nudge."  I was.  Most companies
got through it with few problems --- a tribute to the programming staff.

Ted

On 09/01/2012 07:51 PM, Simon Slavin wrote:
> On 1 Sep 2012, at 11:34pm, "Ted Rolle, Jr."  wrote:
>
>> Back in the olden days we predicted a database's storage to be about 5
>> times the size of the data.
>> By 'olden' I mean IBM's IMS, VSAM, DB2. ..., 70s, 80s.
> Back in the old days you had 72 or 80 columns to a punched card, and any 
> columns you didn't use were wasted.  You could double your database 
> capacity/speed/cost by saving one bit per record.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size

2012-09-01 Thread Simon Slavin

On 1 Sep 2012, at 11:34pm, "Ted Rolle, Jr."  wrote:

> Back in the olden days we predicted a database's storage to be about 5
> times the size of the data.
> By 'olden' I mean IBM's IMS, VSAM, DB2. ..., 70s, 80s.

Back in the old days you had 72 or 80 columns to a punched card, and any 
columns you didn't use were wasted.  You could double your database 
capacity/speed/cost by saving one bit per record.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size

2012-09-01 Thread Richard Hipp
On Sat, Sep 1, 2012 at 6:34 PM, Ted Rolle, Jr.  wrote:

> Back in the olden days we predicted a database's storage to be about 5
> times the size of the data.
> By 'olden' I mean IBM's IMS, VSAM, DB2. ..., 70s, 80s.
> I hope this is still not the case...
>

A lot depends on your data, of course.

But the Fossil  repository (an SQLite database)
that holds the complete 12.5 year revision history of SQLite is about 69.4%
efficient at holding data overall (meaning that the content held is about
69.4% of the total database size, and about 82.8% efficient if you exclude
indices.  That is a lot better than your 20% rule-of-thumb.  On the other
hand, you can make the overall storage efficiency as small as you want by
creating enough useless indices...

You can measure the storage efficiency of your on SQLite databases using
sqlite3_analyzer.exe binary available on the download
page
.


>
> Ted
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database size (again)

2010-12-07 Thread Max Vlasov
On Tue, Dec 7, 2010 at 6:57 PM, Laszlo Nemeth wrote:

> Hi,
>
> Question: is it possible to recompile sqlite to force the
> representation of integers to be 4 bytes, and that of floats to be
> also 4 bytes. I would like to have no observable change in the
> behaviour of sqlite.
>
>

Since integers are effectively packed in sqlite and I suppose you can not
map you tick to rowid (I guessed it from the name), the only possible
improvement could be floats. You could borrow this idea:
http://stackoverflow.com/questions/2775854/map-a-32-bit-float-to-a-32-bit-integerto
map it in your reading/writing code to save space for floats. In worst
case every 8 bytes float will be 4(5)-byte integer (5 possible due to the
internal packed format of sqlite), so maybe you will save 10-15 bytes per
record (I added also indexes)

Max Vlasov
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database size (again)

2010-12-07 Thread Jim Wilcoxson
A lot of the SQLite overhead is going to be in the stuff surrounding your
actual data; I'd be surprised if you saved much space by using fixed-size
ints vs the varints used by SQLite.  You didn't mention about indexes; if
you have any, they will take a lot of space because your row size is so
small.

Maybe write your own VFS for SQLite?

Definitely run sqlite3_analyzer before deciding anything.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://sites.google.com/site/hashbackup


On Tue, Dec 7, 2010 at 10:57 AM, Laszlo Nemeth wrote:

> Hi,
>
> I have a database (6 in fact) of high-frequency data
>
> create table eurusd (tick integer not null, bid float not null, ask
> float not null);
>
> with 80M records currently and growing, freshly inserted, no deletions
> will ever take place, the schema will never change, and neither of the
> fields can be null. The size is already 3.6G (I put an index on it
> after bulk insert), which is a bit too much for me considering that
> the existing infrastructure (ie binary file 12bytes per record) is
> 800M (which fits into memory and I don't even need to index).
>
> Having checked older posts on sqlite-users, I noticed that this issue
> comes up frequently and normally the problem is either wrong choice of
> datatypes (ie text instead of integer), or unnecessary indices. None
> of which applies here.
>
> Question: is it possible to recompile sqlite to force the
> representation of integers to be 4 bytes, and that of floats to be
> also 4 bytes. I would like to have no observable change in the
> behaviour of sqlite.
>
> I
> (1) am quite comfortable with the hacking,
> (2) understand that the database will no longer be platform
> independent, nor compatible with anything else,
> (3) tried to run the analyser to see if there is something fishy with
> half full pages, but it wants tcl8.6 and haven't gotten around to
> install it (will do),
> (4) also checked the file format document, but that didn't give me any
> immediate hint how to achieve what  I'd like, though I only skimmed it
> through.
>
> The point of doing this is that I get a smaller db, and I still get
> all the beautiful machinery built for sqlite.
>
> Any suggestions, a complete solution, or  a "no  that's not possible,
> because..." will be much appreciated.
> Thanks, Z
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database size

2004-12-02 Thread D. Richard Hipp
amead wrote:
[W]hen I browse the freshly created database (using xxd), I see 
long stings of zero bytes:

It looks like SQLite pads records with zero bytes?  Is this necessary?  
Are there any options I can use to make the database use less disk space?

Links that might help:
http://www.sqlite.org/sqlite_analyzer-3.0.5.bin.gz
  Download and run this program against your database
  to see what the space is being used for.
http://www.sqlite.org/cvstrac/getfile/sqlite/src/btree.c
  The header comment on the btree.c source file explains
  the file format and the origin of the zeros that you
  are concerned about.
http://www.sqlite.org/php2004/page-001.html
  Slides from a talk on the architecture of SQLite.

--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565