Simon,

No, we knew about normalisation, the database is normalised, that part of the design hasn't changed in years.

The reasons for the massive reduction in database size is that we worked out how to handle repeating lines of data that change just enough that we thought they needed to be kept. With some small tweaks we could eliminate data that we thought we needed to preserve. We had assumed that we needed to keep everything, but by going back in, looking at what actual data we now had, we worked out we could store a small amount of extra information that stores some state change but this meant we could dump most of the database.

It was something that we hadn't appreciated a few years ago, and the ever increasing database meant we had to do something about the DB size.

Its taken a few months of work, mainly in a dark room with a wet towel on our foreheads, to do the tiny changes necessary to make a big difference.

In hindsight the fact that SQLite is so easy and fast to use was a slight disadvantage to us, it allowed us to be a bit quick and dirty with designs, when we should have thought through some of the issues. However thats what startups are like ;)

I think all the explanations to date have been helpful and appreciate the time take to answer, we're going to think a little more carefully about how we manage our database on a VM. I'm in the process of moving home so the only real hardware (of any note) is sitting in storage so the only testing we can do is either virtualised or on Mac laptops. Neither of which will help us in this instance.

Rob

On 17 Jul 2018, at 13:16, Simon Slavin wrote:

On 17 Jul 2018, at 1:10pm, R Smith <ryansmit...@gmail.com> wrote:

What kind of data did you store (maybe some examples if possible) that you could condense it by ~99% like that?

I think that the OP's organisation discovered the 'relational' part of RDBMS and implemented normalisation.

To Rob Willett: Ryan Smith's explanation is better than mine. Please read his post and ignore my wooly one. I tried three ways to get across the sequential-access vs. random-access point and wasn't really happy with anything I wrote.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to