Hello there,
This is what I mean by repetitive data:
Tables:
E:\DirectX90c\
E:\DirectX90c\Feb2006_MDX1_x86_Archive.cab\
E:\DirectX90c\Feb2006_d3dx9_29_x64.cab\
E:\DirectX90c\Feb2006_xact_x64.cab\
E:\DirectX90c\Feb2006_MDX1_x86.cab\
E:\DirectX90c\Feb2006_xact_x86.cab\
And so on. As you can see, the string E:\DirectX90c\ repeats all the time in
this example. (So does "Feb2006_", on almost every table.)
It's just an example of the type of repetitive data I have to deal with;
they are normally paths. Since there are directories within directories, the
paths repeat.
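
To put a rough number on that repetition, here is a small sketch that
measures how many characters of the sample paths above are just the shared
prefix. It only uses the six strings listed in this message; the variable
names are made up for illustration, and real data would of course have more
than one prefix.

```python
import os

# The six sample paths from this message (backslashes escaped for Python).
paths = [
    "E:\\DirectX90c\\",
    "E:\\DirectX90c\\Feb2006_MDX1_x86_Archive.cab\\",
    "E:\\DirectX90c\\Feb2006_d3dx9_29_x64.cab\\",
    "E:\\DirectX90c\\Feb2006_xact_x64.cab\\",
    "E:\\DirectX90c\\Feb2006_MDX1_x86.cab\\",
    "E:\\DirectX90c\\Feb2006_xact_x86.cab\\",
]

# commonprefix() compares character by character, not path component by
# path component, which is exactly what we want for counting repeated bytes.
prefix = os.path.commonprefix(paths)

total_chars = sum(len(p) for p in paths)
# Every path after the first stores the shared prefix again.
repeated_chars = len(prefix) * (len(paths) - 1)

print("shared prefix:", prefix)
print("repeated characters:", repeated_chars, "of", total_chars)
```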
What would be an ideal approach for this situation? I would like to save
space, but I wouldn't like to waste a large amount of processing power to do
so.
One must keep in mind that my system must perform "well" in various
situations (which I can't predict, at least not all of them); for this reason
I can't have a very elaborate database schema. Sometimes saving a few KBs
could mean wasting a ton of cycles, and I can't deal with that. I'd
rather have those extra KBs and a responsive application than save a few
KBs and fall asleep at the keyboard (don't worry, it's a multi-threaded
environment; however, it's important to keep it optimized, and I'm just
exaggerating the problem a little).
I'd like to take the right 'path' here...
Thanks.
----- Original Message -----
From: "Darren Duncan" <[EMAIL PROTECTED]>
To: <sqlite-users@sqlite.org>
Sent: Thursday, July 06, 2006 12:04 AM
Subject: Re: [sqlite] Compressing the DBs?
At 6:04 PM -0300 7/5/06, Gussimulator wrote:
Now, since there's a lot of repetitive data, I thought that compressing the
database would be a good idea, since, as we all know, one of the first
principles of data compression is getting rid of repetitive data. So I
was wondering if this is possible with SQLite, or would it be quite a pain
to implement a compression scheme by myself? I have worked with many
compression libraries before, so that wouldn't be an issue; the issue,
however, would be integrating any of the libraries into SQLite...
First things first, what do you mean by "repetitive"?
Do you mean that there are many copies of the same data?
Perhaps a better approach is to normalize the database and just store
single copies of things.
If you have tables with duplicate rows, then add a 'quantity' column and
reduce to one copy of the actual data.
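
As a minimal sketch of the 'quantity' idea, assuming a hypothetical table
raw_items whose rows are fully duplicated (the table and column names here
are invented for illustration, not taken from the original database), the
duplicates can be collapsed with a GROUP BY:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table containing duplicate rows.
conn.execute("CREATE TABLE raw_items (name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO raw_items (name) VALUES (?)",
    [("a.cab",), ("a.cab",), ("b.cab",), ("a.cab",)],
)

# One row per distinct value, plus a count of how often it occurred.
conn.execute(
    "CREATE TABLE items (name TEXT PRIMARY KEY, quantity INTEGER NOT NULL)"
)
conn.execute(
    "INSERT INTO items (name, quantity) "
    "SELECT name, COUNT(*) FROM raw_items GROUP BY name"
)

rows = conn.execute(
    "SELECT name, quantity FROM items ORDER BY name"
).fetchall()
print(rows)
```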
If some columns are unique and some are repeated, perhaps try splitting
the tables into more tables that are related.
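
For path-like data, one way to do that split is to store each parent
directory once and have every entry refer to it by id. The sketch below is
only an illustration under that assumption: the directories and entries
tables, and the add_path and full_paths helpers, are hypothetical names, and
the sample paths are a guess at the kind of data involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each distinct parent directory is stored exactly once.
    CREATE TABLE directories (
        id   INTEGER PRIMARY KEY,
        path TEXT NOT NULL UNIQUE
    );
    -- Entries keep only their own name plus a small integer reference.
    CREATE TABLE entries (
        id     INTEGER PRIMARY KEY,
        dir_id INTEGER NOT NULL REFERENCES directories(id),
        name   TEXT NOT NULL
    );
""")

def add_path(conn, full_path):
    """Split a backslash-separated path and store parent and name separately."""
    parent, _, name = full_path.rstrip("\\").rpartition("\\")
    conn.execute(
        "INSERT OR IGNORE INTO directories (path) VALUES (?)", (parent,)
    )
    (dir_id,) = conn.execute(
        "SELECT id FROM directories WHERE path = ?", (parent,)
    ).fetchone()
    conn.execute(
        "INSERT INTO entries (dir_id, name) VALUES (?, ?)", (dir_id, name)
    )

def full_paths(conn):
    """Rebuild the original paths with a join, in insertion order."""
    rows = conn.execute(
        "SELECT d.path || ? || e.name "
        "FROM entries e JOIN directories d ON d.id = e.dir_id "
        "ORDER BY e.id",
        ("\\",),
    )
    return [row[0] for row in rows]

samples = [
    "E:\\DirectX90c\\Feb2006_xact_x64.cab",
    "E:\\DirectX90c\\Feb2006_xact_x86.cab",
    "E:\\DirectX90c\\Feb2006_MDX1_x86.cab",
]
for p in samples:
    add_path(conn, p)

print(full_paths(conn))
```

The join costs one indexed lookup per row (directories.id is the rowid), so
the space saving should not come at a large processing cost, though that is
worth measuring on real data.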
This, really, is what you should be doing first, and may very well be the
only step you need.
If you can't do that, then please explain in what way the data is
repetitive.
-- Darren Duncan