Re: [sqlite] is blob compression worth it

2006-12-08 Thread Daniel Önnerby

Thanks for pointing out the obvious :)

Seriously though, there are times when probably all of us have made "just 
a simple database" that was not properly normalized and later turns out 
to be used far more than intended. Normalizing the database at that later 
stage requires a lot of reprogramming and rewriting a lot of SQL. I could 
see a use for this kind of functionality, but the best approach would 
always be to normalize.
Then again, I was just curious to see whether anyone had tried or thought 
about something like this before. I'm not even sure I would want this 
type of functionality implemented in SQLite.


Best regards
Daniel

John Stanton wrote:
Your solution here is to normalize your database.  Third normal form 
will do it for you.





Re: [sqlite] is blob compression worth it

2006-12-08 Thread John Stanton
Your solution here is to normalize your database.  Third normal form 
will do it for you.


Daniel Önnerby wrote:

Just out of curiosity:
If I, for instance, have 1000 rows in a table with a lot of blobs, and many 
of them contain the same data, is there any way to make a plugin for SQLite 
that would just save a reference to another blob when it is identical? I 
guess this could save a lot of space without any fancy compression 
algorithm, and if the blob field is already indexed there would be no extra 
time needed to locate the other identical blobs :)


Just a thought :)




Re: [sqlite] is blob compression worth it

2006-12-08 Thread Dennis Cote

Daniel Önnerby wrote:

Just out of curiosity:
If I, for instance, have 1000 rows in a table with a lot of blobs, and many 
of them contain the same data, is there any way to make a plugin for SQLite 
that would just save a reference to another blob when it is identical? I 
guess this could save a lot of space without any fancy compression 
algorithm, and if the blob field is already indexed there would be no extra 
time needed to locate the other identical blobs :)



Daniel,

This is exactly what relational database normalization is about. If you 
have many copies of the same blob, you have redundant data. The best way 
to handle that is to normalize the database by moving one copy of the 
redundant data into a separate table. You then store the id of that 
record in the original tables wherever you need a reference to the data. 
For blob data you would probably want to store a hash of the blob value 
to speed up comparisons, though this isn't absolutely necessary. You can 
reconstruct the original data records by joining the original tables 
with the new blob table when needed.


You can do this today without any new plugin for SQLite, and it works in 
any relational database.
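
As a rough sketch of what this looks like in practice (the table and 
column names below are made up purely for illustration, and the hash 
column is just an optimization): each distinct blob is stored once in a 
separate table, and the main table only keeps an integer reference to it.

import hashlib
import sqlite3

# Illustrative schema: blob_store holds each distinct blob exactly once;
# measurements references it by integer id.  The hash column speeds up the
# "have we already stored this blob?" lookup.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blob_store (
        id   INTEGER PRIMARY KEY,
        hash TEXT UNIQUE,
        data BLOB
    );
    CREATE TABLE measurements (
        id      INTEGER PRIMARY KEY,
        name    TEXT,
        blob_id INTEGER REFERENCES blob_store(id)
    );
""")

def insert_measurement(name, payload):
    """Store payload once in blob_store and reference it from measurements."""
    digest = hashlib.sha256(payload).hexdigest()
    row = conn.execute("SELECT id FROM blob_store WHERE hash = ?",
                       (digest,)).fetchone()
    if row:
        blob_id = row[0]                     # identical blob already stored
    else:
        cur = conn.execute("INSERT INTO blob_store (hash, data) VALUES (?, ?)",
                           (digest, payload))
        blob_id = cur.lastrowid
    conn.execute("INSERT INTO measurements (name, blob_id) VALUES (?, ?)",
                 (name, blob_id))

insert_measurement("a", b"\x00" * 1000)
insert_measurement("b", b"\x00" * 1000)      # duplicate payload, stored once

# Reconstruct the original rows with a join.
for name, data in conn.execute(
        "SELECT m.name, b.data FROM measurements m "
        "JOIN blob_store b ON b.id = m.blob_id"):
    print(name, len(data))

If you are worried about hash collisions you can additionally compare the 
blob itself whenever the hash matches, but with SHA-256 that is rarely 
necessary in practice.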


Normalization like this works just as well for non-blob data.

Dennis Cote






Re: [sqlite] is blob compression worth it

2006-12-08 Thread Daniel Önnerby

Just out of curiosity:
If I, for instance, have 1000 rows in a table with a lot of blobs, and many 
of them contain the same data, is there any way to make a plugin for SQLite 
that would just save a reference to another blob when it is identical? I 
guess this could save a lot of space without any fancy compression 
algorithm, and if the blob field is already indexed there would be no extra 
time needed to locate the other identical blobs :)


Just a thought :)

John Stanton wrote:

What are you using for compression?

Have you checked that you get a useful degree of compression on that 
numeric data?  You might find that it is not particularly amenable to 
compression.





Re: [sqlite] is blob compression worth it

2006-12-04 Thread John Stanton

What are you using for compression?

Have you checked that you get a useful degree of compression on that 
numeric data?  You might find that it is not particularly amenable to 
compression.
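
One quick way to answer that for the data at hand is to pack a 
representative array of doubles and check the zlib ratio. The data below 
is synthetic, so it only illustrates the idea: a general-purpose byte 
compressor gains a lot when the values repeat and very little when they 
don't.

import array
import random
import zlib

def ratio(values):
    """zlib-compress an array of doubles; return compressed/original size."""
    raw = array.array("d", values).tobytes()
    return len(zlib.compress(raw, 6)) / len(raw)

n = 100_000
repeated = [1.5] * n                          # highly redundant data
noisy = [random.random() for _ in range(n)]   # essentially incompressible

print("repeated doubles:", round(ratio(repeated), 2), "of original size")
print("random doubles:  ", round(ratio(noisy), 2), "of original size")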


Hickey, Larry wrote:

I have a blob structure which is primarily doubles. Does anyone have 
experience with data compression to make the blobs smaller? Tests I have 
run so far indicate that compression is too slow on blobs of a few 
megabytes to be practical. I currently get at least 20 to 40 inserts per 
second, but if a single compression takes over a second, it's clearly not 
worth the trouble. Does anybody have experience with a compression scheme 
for blobs that consist mostly of arrays of doubles?
Some schemes (ibsen) offer lightning-fast decompression, so if the database 
were used primarily for reads this would be a good choice, but the 
compression required to build it is very expensive.





Re: [sqlite] is blob compression worth it

2006-12-04 Thread Günter Greschenz

Hi,
I've written field-based compression using bzip2.
My experience: the fields must be at least 50 bytes, or the compressed 
data comes out bigger!

cu, gg
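
A small sketch of the break-even behaviour described above (the exact 
threshold depends on the data, but bzip2's own header overhead already 
makes very small fields grow): compress each field and keep the compressed 
form only when it is actually smaller, remembering a flag so the reader 
knows which form was stored.

import bz2

def maybe_compress(field):
    """Return (compressed_flag, data); fall back to the raw bytes when
    compression does not pay off, which is typical for small fields."""
    packed = bz2.compress(field)
    if len(packed) < len(field):
        return True, packed
    return False, field

for size in (10, 50, 200, 5000):
    raw = (b"some moderately repetitive text " * (size // 32 + 1))[:size]
    compressed, data = maybe_compress(raw)
    print(size, "raw bytes ->", len(data), "stored bytes,",
          "bz2" if compressed else "raw")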

Hickey, Larry wrote:

I have a blob structure which is primarily doubles. Does anyone have 
experience with data compression to make the blobs smaller? Tests I have 
run so far indicate that compression is too slow on blobs of a few 
megabytes to be practical. I currently get at least 20 to 40 inserts per 
second, but if a single compression takes over a second, it's clearly not 
worth the trouble. Does anybody have experience with a compression scheme 
for blobs that consist mostly of arrays of doubles?
Some schemes (ibsen) offer lightning-fast decompression, so if the database 
were used primarily for reads this would be a good choice, but the 
compression required to build it is very expensive.




[sqlite] is blob compression worth it

2006-12-04 Thread Hickey, Larry
I have a blob structure which is primarily doubles. Does anyone have 
experience with data compression to make the blobs smaller? Tests I have 
run so far indicate that compression is too slow on blobs of a few 
megabytes to be practical. I currently get at least 20 to 40 inserts per 
second, but if a single compression takes over a second, it's clearly not 
worth the trouble. Does anybody have experience with a compression scheme 
for blobs that consist mostly of arrays of doubles?
Some schemes (ibsen) offer lightning-fast decompression, so if the database 
were used primarily for reads this would be a good choice, but the 
compression required to build it is very expensive.
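
One way to make the trade-off concrete is to time compression of a blob 
about the same size as the ones described against the insert budget: at 
20 to 40 inserts per second there are only roughly 25 to 50 ms per row. 
The sketch below uses zlib on made-up data, so only measurements on the 
real blobs settle the question.

import array
import random
import time
import zlib

# Build a blob of roughly the size described above: a few MB of doubles.
n_doubles = 500_000                                   # ~4 MB at 8 bytes each
blob = array.array("d", (random.random() for _ in range(n_doubles))).tobytes()

for level in (1, 6, 9):
    start = time.perf_counter()
    packed = zlib.compress(blob, level)
    elapsed = (time.perf_counter() - start) * 1000.0  # milliseconds
    print("zlib level", level, ":", round(elapsed, 1), "ms,",
          round(len(packed) / len(blob), 2), "of original size")

# At 20-40 inserts/s there are only ~25-50 ms per row, so compression is
# only worth it if it fits comfortably inside that window.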

