Repetition like that doesn't mean that you cannot normalize into third
normal form. In your simple case, a table that stores the common prefix
once (X="E:\DirectX90c\Feb2006_") would work. It would also perform
better than brute-force compression, which requires inflating the data
for every search.
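
A minimal sketch of that idea, using Python's sqlite3 module. The table
and column names (dirs, files, dir_id) and both helper functions are
hypothetical, and the split is made at the last backslash rather than
at the "Feb2006_" prefix, so treat it as one possible layout rather
than a definitive schema:

```python
import ntpath   # Windows path rules, usable on any platform
import sqlite3


def store_paths(conn, paths):
    """Store each directory prefix once and reference it by integer id."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS dirs (      -- hypothetical names
            id   INTEGER PRIMARY KEY,
            path TEXT NOT NULL UNIQUE
        );
        CREATE TABLE IF NOT EXISTS files (
            dir_id INTEGER NOT NULL REFERENCES dirs(id),
            name   TEXT NOT NULL
        );
    """)
    for full in paths:
        prefix, name = ntpath.split(full)
        # The UNIQUE constraint makes a repeated prefix a no-op.
        conn.execute("INSERT OR IGNORE INTO dirs (path) VALUES (?)", (prefix,))
        (dir_id,) = conn.execute(
            "SELECT id FROM dirs WHERE path = ?", (prefix,)
        ).fetchone()
        conn.execute(
            "INSERT INTO files (dir_id, name) VALUES (?, ?)", (dir_id, name)
        )


def full_paths(conn):
    """Rebuild the original paths with a join, in insertion order."""
    rows = conn.execute(
        "SELECT dirs.path, files.name FROM files "
        "JOIN dirs ON dirs.id = files.dir_id ORDER BY files.rowid"
    )
    return [ntpath.join(prefix, name) for prefix, name in rows]
```

Searches stay plain SQL over short strings and small integers, so
nothing has to be inflated before a lookup.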
Gussimulator wrote:
Hello there,
This is what I mean by repetitive data:
Tables:
E:\DirectX90c\
E:\DirectX90c\Feb2006_MDX1_x86_Archive.cab\
E:\DirectX90c\Feb2006_d3dx9_29_x64.cab\
E:\DirectX90c\Feb2006_xact_x64.cab\
E:\DirectX90c\Feb2006_MDX1_x86.cab\
E:\DirectX90c\Feb2006_xact_x86.cab\
And so on. As you can see, the string E:\DirectX90c\ repeats all the
time in this example. (So does "Feb2006_", in almost every table.)
It's just an example of the type of repetitive data I have to deal with;
it is normally paths. Since there are directories within directories,
the paths repeat.
What would be an ideal approach for this situation? I would like to save
space, but I wouldn't like to waste a large amount of processing power
to do so.
One must keep in mind that my system must perform "well" in various
situations (which I can't predict, at least not all of them); for this
reason I can't have a very elaborate database schema. Sometimes saving a
few KBs could mean wasting a few tons of cycles, and I can't deal with
that. I'd rather have those extra KBs and a responsive application than
save a few KBs and fall asleep at the keyboard (don't worry, it's a
multi-threaded environment; however, it's important to keep it
optimized, I'm just exaggerating the problem a little).
I'd like to take the right 'path' here...
Thanks.
----- Original Message -----
From: "Darren Duncan" <[EMAIL PROTECTED]>
To: <sqlite-users@sqlite.org>
Sent: Thursday, July 06, 2006 12:04 AM
Subject: Re: [sqlite] Compressing the DBs?
At 6:04 PM -0300 7/5/06, Gussimulator wrote:
Now, since there's a lot of repetitive data, I thought that
compressing the database would be a good idea since, as we all know,
one of the first principles of data compression is getting rid of
repetitive data. So I was wondering whether this is possible with
SQLite, or would it be quite a pain to implement a compression scheme
myself? I have worked with many compression libraries before, so
that wouldn't be an issue; the issue, however, would be integrating
any of the libraries into SQLite...
First things first, what do you mean by "repetitive"?
Do you mean that there are many copies of the same data?
Perhaps a better approach is to normalize the database and just store
single copies of things.
If you have tables with duplicate rows, then add a 'quantity' column
and reduce to one copy of the actual data.
If some columns are unique and some are repeated, perhaps try
splitting the tables into more tables that are related.
This, really, is what you should be doing first, and may very well be
the only step you need.
If you can't do that, then please explain in what way the data is
repetitive.
-- Darren Duncan
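
The 'quantity' column suggested above for duplicate rows could be
sketched roughly as follows. The table names (raw_items, items), the
single value column, and the helper function are assumptions made for
illustration only, not anything from the original schema:

```python
import sqlite3


def collapse_duplicates(conn, rows):
    """Load rows, then keep one copy of each value plus a count."""
    # Hypothetical staging table holding the data with its duplicates.
    conn.execute("CREATE TABLE raw_items (value TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO raw_items (value) VALUES (?)", [(r,) for r in rows]
    )
    # One row per distinct value; 'quantity' records how often it occurred.
    conn.execute("""
        CREATE TABLE items AS
        SELECT value, COUNT(*) AS quantity
        FROM raw_items
        GROUP BY value
    """)
    conn.execute("DROP TABLE raw_items")
    return conn.execute(
        "SELECT value, quantity FROM items ORDER BY value"
    ).fetchall()
```

This only helps when whole rows repeat; for values that merely share a
prefix, splitting into related tables is the better fit.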