At 07:07 06/07/2006, you wrote:
Hello there,

This is what I mean by repetitive data:

Tables:
E:\DirectX90c\
E:\DirectX90c\Feb2006_MDX1_x86_Archive.cab\
E:\DirectX90c\Feb2006_d3dx9_29_x64.cab\
E:\DirectX90c\Feb2006_xact_x64.cab\
E:\DirectX90c\Feb2006_MDX1_x86.cab\
E:\DirectX90c\Feb2006_xact_x86.cab\

And so on. As you can see, the string E:\DirectX90c\ repeats all the time in this example. ("Feb2006_" also appears in almost every table.)

It's just an example of the type of repetitive data I have to deal with; it is normally paths. Since there are directories within directories, the paths repeat.
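To put a number on the repetition described above, here is a small Python sketch (not part of the original message). It only measures how many characters belong to the prefix shared by all of the example paths; the variable names are made up for illustration.

```python
import os.path

# Sample paths copied from the example in the message above.
paths = [
    "E:\\DirectX90c\\",
    "E:\\DirectX90c\\Feb2006_MDX1_x86_Archive.cab\\",
    "E:\\DirectX90c\\Feb2006_d3dx9_29_x64.cab\\",
    "E:\\DirectX90c\\Feb2006_xact_x64.cab\\",
    "E:\\DirectX90c\\Feb2006_MDX1_x86.cab\\",
    "E:\\DirectX90c\\Feb2006_xact_x86.cab\\",
]

# commonprefix compares character by character, so it behaves the same
# on any platform, even though these are Windows-style paths.
prefix = os.path.commonprefix(paths)

total = sum(len(p) for p in paths)      # characters stored overall
repeated = len(prefix) * len(paths)     # characters spent on the shared prefix

print(f"shared prefix: {prefix!r}")
print(f"{repeated} of {total} characters are the repeated prefix")
```

This counts only the prefix common to every path; shorter repeats such as "Feb2006_" are not included in the figure.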

What would be an ideal approach for this situation? I would like to save space, but I wouldn't like to waste a large amount of processing power to do so.

One must keep in mind that my system must perform "well" in various situations (which I can't predict, at least not all of them); for this reason I can't have a very elaborate database schema. Sometimes saving a few KB could mean wasting a ton of cycles, and I can't afford that. I'd rather have those extra KB and a responsive application than save a few KB and fall asleep at the keyboard. (Don't worry, it's a multi-threaded environment; however, it's important to keep it optimized. I'm just exaggerating the problem a little.)


I'd like to take the right 'path' here...
Thanks.

SQLite has no free compression system. Also, any compression must be done at the page level, not the data (row) level, because most compression algorithms use statistics gathered from past data; if you try to do it at the data level, then when a row is deleted all rows after it become garbage.
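A rough way to see the trade-off between the two levels, sketched in Python. This uses zlib (an LZ-based compressor from the Python standard library) purely as a stand-in, not the coder suggested in this reply, and the "page" is just the example paths joined with newlines, which is an assumption and not SQLite's real page format.

```python
import zlib

# Example rows taken from the question; the layout is hypothetical.
rows = [
    b"E:\\DirectX90c\\",
    b"E:\\DirectX90c\\Feb2006_MDX1_x86_Archive.cab\\",
    b"E:\\DirectX90c\\Feb2006_d3dx9_29_x64.cab\\",
    b"E:\\DirectX90c\\Feb2006_xact_x64.cab\\",
    b"E:\\DirectX90c\\Feb2006_MDX1_x86.cab\\",
    b"E:\\DirectX90c\\Feb2006_xact_x86.cab\\",
]

# Row level: every row is compressed alone, so the compressor never
# sees the repetition that exists between rows.
row_level_size = sum(len(zlib.compress(r)) for r in rows)

# Page level: the rows are compressed together as one unit, so later
# rows can be coded in terms of earlier ones.
page = b"\n".join(rows)
page_level = zlib.compress(page)

# Deleting a row from the compressed page means decompressing the
# whole unit, editing it, and compressing it again.
remaining = [r for r in zlib.decompress(page_level).split(b"\n") if r != rows[2]]
new_page = zlib.compress(b"\n".join(remaining))

print("row level :", row_level_size, "bytes")
print("page level:", len(page_level), "bytes")
```

The page-level variant depends on earlier data to decode later data, which is why the unit of compression has to be something that is always read and rewritten as a whole.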

There are a lot of compression algorithms, but I think the best fit here is an arithmetic or range coder with an order-0 or order-1 model; the page size is too small for higher orders or for LZ algorithms. Both (arithmetic and range coders) are pretty fast, need no FPU code (integer only, so they suit embedded devices), and I don't think they will slow file I/O too much. On text-only data you can expect about 2.5 bpb (bits per byte), a size reduction of roughly 65 to 70%, and more when the page size is larger.
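To get a feel for the bits-per-byte figure, here is a minimal Python sketch that estimates the order-0 limit for a block of data. It computes the Shannon entropy of the byte histogram, which is what an ideal order-0 coder would approach; it is not an arithmetic or range coder, and the function name is made up for this example.

```python
import math
from collections import Counter


def order0_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per input byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# Sample data: the example paths from the question, one per line.
sample = b"\n".join([
    b"E:\\DirectX90c\\",
    b"E:\\DirectX90c\\Feb2006_MDX1_x86_Archive.cab\\",
    b"E:\\DirectX90c\\Feb2006_d3dx9_29_x64.cab\\",
    b"E:\\DirectX90c\\Feb2006_xact_x64.cab\\",
    b"E:\\DirectX90c\\Feb2006_MDX1_x86.cab\\",
    b"E:\\DirectX90c\\Feb2006_xact_x86.cab\\",
])

bpb = order0_bits_per_byte(sample)
saving = 100 * (1 - bpb / 8)
print(f"order-0 estimate: {bpb:.2f} bits per byte, about {saving:.0f}% smaller")
```

A real adaptive coder also has to learn the byte statistics as it goes, so on a block this small the achieved size will differ from the estimate.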

The code is about 10 KB, but I don't know where to plug it in ;)

HTH
