Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-04-10 Thread Mark
John Machin wrote:
> On 16/03/2009 5:48 PM, Pierre Chatelier wrote:
>>> A few hundred blocks of raw data? Blocksize approx 300K bytes?  
>>> Database
>>> created and dropped by the same process? 500 blocks is approx 150M
>>> bytes; why not keep it in a hash table in memory? If you keep it in a
>>> database or the file system, it's going to be washed through your real
>>> memory and pagefile-aka-swap-partition anyway, so just cut out the
>>> middlemen :-)
>> You're right, but who said I have only 1 DB at a time? :-)
> 
> You didn't say anything at all about how many DBs you have, so it wasn't 
> you.
> 
> 
>> In fact, I have several DBs and I do not know in advance what size it
>> will represent.
> 
> What is "it"?
> 
>> Perhaps 500MB. And I need RAM for other stuff, so the  
>> simplest thing is to use "normal" DBs.
> 
> You've lost me now. You need RAM for your working set of whatever you
> are accessing at the time; it doesn't matter whether it came from a file
> or a DB (which is just a structured file, probably not optimised for
> 300KB BLOBs) or was built in memory, and what's not being used at the
> time will be in your filesystem or in your swap partition.
> 
> Please re-read what I wrote, to which your response was "You're right",
> then consider that the total amount of data is not very relevant; what
> matters is the size of your working set, mostly irrespective of its source.
> 
> However, the overhead of packing/unpacking 300KB blobs into/out of a
> database can't be overlooked.
> 
> I would suggest giving serious thought to a variant of an earlier
> poster's suggestion: have each BLOB in its own file in the file
> system, but mmap them.
> 
> 
>> Using memory DBs and swapping
>> them afterwards would not be smooth.
>>
>> But we are not answering my initial question!
>>
>> Can I expect some gain in
>> -recompiling SQLite (which options/DEFINEs would help?)
>> -using custom memory allocators (I am on Win32, in a multi-threaded
>> environment, and yes, "it's bad")
>> -using compression
> 
> Compression? You tell us. What percentage compression do you get with 
> these 300KB BLOBs with (say) bz2? How long does it take to read in a 
> bz2-compressed BLOB and uncompress it compared to reading in an 
> uncompressed BLOB?
> 
> Cheers,
> John
> 

I'm doing nearly the exact same thing, except my BLOBs are about 8k and
compress to about 2k. In my testing it turned out to be significantly
faster to skip the compression step (disk space is cheap, right?) and
write the data directly to the database.
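
(For reference, a minimal sketch of that kind of comparison: insert a block
either raw or zlib-compressed, so the two code paths can be timed against
each other. The table blocks(id INTEGER PRIMARY KEY, data BLOB) and the
helper name are only illustrative, not taken from anyone's actual code.)

/* Sketch only: insert a block raw or zlib-compressed so the two paths
   can be timed against each other. */
#include <sqlite3.h>
#include <zlib.h>
#include <stdlib.h>

static int insert_block(sqlite3 *db, int id,
                        const unsigned char *data, size_t len,
                        int use_zlib)
{
    const unsigned char *payload = data;
    uLongf plen = (uLongf)len;
    unsigned char *buf = NULL;
    sqlite3_stmt *stmt = NULL;
    int rc;

    if (use_zlib) {
        uLongf bound = compressBound((uLong)len);
        buf = malloc(bound);
        if (!buf || compress(buf, &bound, data, (uLong)len) != Z_OK) {
            free(buf);
            return SQLITE_ERROR;
        }
        payload = buf;
        plen = bound;   /* compress() stores the compressed size here */
    }

    rc = sqlite3_prepare_v2(db,
             "INSERT INTO blocks(id, data) VALUES(?, ?)", -1, &stmt, NULL);
    if (rc == SQLITE_OK) {
        sqlite3_bind_int(stmt, 1, id);
        sqlite3_bind_blob(stmt, 2, payload, (int)plen, SQLITE_TRANSIENT);
        rc = (sqlite3_step(stmt) == SQLITE_DONE) ? SQLITE_OK : SQLITE_ERROR;
    }
    sqlite3_finalize(stmt);
    free(buf);
    return rc;
}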

As to your other questions, I have no answers, and indeed, would be 
interested in them if you came up with some!

Mark



Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-17 Thread Alexey Pechnikov
Hello!

On Monday 16 March 2009 09:48:50 Pierre Chatelier wrote:
> But we are not answering my initial question!
>
> Can I expect some gain in
> -recompiling SQLite (which options/DEFINEs would help?)
> -using custom memory allocators (I am on Win32, in a multi-threaded
> environment, and yes, "it's bad")
> -using compression

I have a repository with a tuned SQLite build, a zlib compression
extension, and more.

Debian etch: 
deb http://mobigroup.ru/debian/ etch main contrib non-free
deb-src http://mobigroup.ru/debian/ etch main contrib non-free

Debian lenny: 
deb http://mobigroup.ru/debian/ lenny main contrib non-free
deb-src http://mobigroup.ru/debian/ lenny main contrib non-free

Keyring:
sudo aptitude install debian-mobigroup-keyring

I don't have a Windows build, but you can compile it with MinGW.

Best regards.


Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-16 Thread John Machin
On 16/03/2009 5:48 PM, Pierre Chatelier wrote:
>> A few hundred blocks of raw data? Blocksize approx 300K bytes?  
>> Database
>> created and dropped by the same process? 500 blocks is approx 150M
>> bytes; why not keep it in a hash table in memory? If you keep it in a
>> database or the file system, it's going to be washed through your real
>> memory and pagefile-aka-swap-partition anyway, so just cut out the
>> middlemen :-)
> 
> You're right, but who said I have only 1 DB at a time? :-)

You didn't say anything at all about how many DBs you have, so it wasn't 
you.


> In fact, I have several DBs and I do not know in advance what size it
> will represent.

What is "it"?

> Perhaps 500MB. And I need RAM for other stuff, so the  
> simplest thing is to use "normal" DBs.

You've lost me now. You need RAM for your working set of whatever you
are accessing at the time; it doesn't matter whether it came from a file
or a DB (which is just a structured file, probably not optimised for
300KB BLOBs) or was built in memory, and what's not being used at the
time will be in your filesystem or in your swap partition.

Please re-read what I wrote, to which your response was "You're right",
then consider that the total amount of data is not very relevant; what
matters is the size of your working set, mostly irrespective of its source.

However, the overhead of packing/unpacking 300KB blobs into/out of a
database can't be overlooked.

I would suggest giving serious thought to a variant of an earlier
poster's suggestion: have each BLOB in its own file in the file
system, but mmap them.
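
(For concreteness, a minimal POSIX sketch of that idea; on Win32 the
equivalents would be CreateFile / CreateFileMapping / MapViewOfFile. The
directory layout and file naming are only examples.)

/* Sketch only: map one per-blob file read-only; the caller munmap()s it. */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

void *map_block(const char *dir, unsigned id, size_t *len_out)
{
    char path[512];
    struct stat st;
    void *p;
    int fd;

    snprintf(path, sizeof path, "%s/%08x.blk", dir, id);  /* example name */
    fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    if (fstat(fd, &st) != 0) { close(fd); return NULL; }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED) return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}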


> Using memory DBs and swapping
> them afterwards would not be smooth.
> 
> But we are not answering my initial question!
> 
> Can I expect some gain in
> -recompiling SQLite (which options/DEFINEs would help?)
> -using custom memory allocators (I am on Win32, in a multi-threaded
> environment, and yes, "it's bad")
> -using compression

Compression? You tell us. What percentage compression do you get with 
these 300KB BLOBs with (say) bz2? How long does it take to read in a 
bz2-compressed BLOB and uncompress it compared to reading in an 
uncompressed BLOB?

Cheers,
John




Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-15 Thread Pierre Chatelier
> A few hundred blocks of raw data? Blocksize approx 300K bytes?  
> Database
> created and dropped by the same process? 500 blocks is approx 150M
> bytes; why not keep it in a hash table in memory? If you keep it in a
> database or the file system, it's going to be washed through your real
> memory and pagefile-aka-swap-partition anyway, so just cut out the
> middlemen :-)

You're right, but who said I have only 1 DB at a time? :-)
In fact, I have several DBs and I do not know in advance what size it
will represent. Perhaps 500MB. And I need RAM for other stuff, so the
simplest thing is to use "normal" DBs. Using memory DBs and swapping
them afterwards would not be smooth.

But we are not answering my initial question!

Can I expect some gain in
-recompiling SQLite (which options/DEFINEs would help?)
-using custom memory allocators (I am on Win32, in a multi-threaded
environment, and yes, "it's bad")
-using compression

Regards,

Pierre Chatelier



Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-14 Thread John Machin
On 13/03/2009 11:24 PM, Mike Eggleston wrote:
> On Fri, 13 Mar 2009, Pierre Chatelier might have said:
> 
>> Hello,
>>
>> I am using SQLITE to store and retrieve raw data blocks that are
>> basically ~300KB each. Each block has an int identifier, so that
>> insert/select are easy. This is very basic use: I do not use complex
>> queries, only "INSERT/SELECT where index=...".
>>
>> Now, I am thinking about performance for writing a sequence of a few
>> hundred 300KB blocks as fast as possible.
>> Obviously, I use the bind_blob(), blob_read() and blob_write() functions.
>> I have already tuned the PRAGMAs for journal/synchronous/page_size/cache,
>> so that it's rather efficient.
>> I do not DELETE any content and the whole database is dropped after use:
>> VACUUM is not important.
>>
>> There are other ways to optimize, but I wonder whether they are worth
>> it, or whether the gain would be only marginal for what I am doing.
>> 1) recompiling SQLite? Which compile options would help in this case?
>> 2) using other memory allocators? I am not sure that writing big data
>> blocks triggers many calls to malloc().
>> 3) using compression? zlib could help, but since my data does not
>> compress very well (let's say an average of 20% space can be saved per
>> block), I am not sure that the saving in writing time would balance the
>> compression time.
>>
>> Of course, I am only asking for advice based on your experience; there
>> is certainly no exact answer, and it will always depend on my data.
>>
>> Regards,
>>
>> Pierre Chatelier
> 
> Why do you not use the int converted to hex (sprintf("%08x", id))
> as a file name and just use the file system?
> 

A few hundred blocks of raw data? Blocksize approx 300K bytes? Database 
created and dropped by the same process? 500 blocks is approx 150M 
bytes; why not keep it in a hash table in memory? If you keep it in a 
database or the file system, it's going to be washed through your real 
memory and pagefile-aka-swap-partition anyway, so just cut out the 
middlemen :-)


Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-14 Thread Mike Eggleston
On Fri, 13 Mar 2009, Pierre Chatelier might have said:

> Hello,
> 
> I am using SQLITE to store and retrieve raw data blocks that are
> basically ~300KB each. Each block has an int identifier, so that
> insert/select are easy. This is very basic use: I do not use complex
> queries, only "INSERT/SELECT where index=...".
> 
> Now, I am thinking about performance for writing a sequence of a few
> hundred 300KB blocks as fast as possible.
> Obviously, I use the bind_blob(), blob_read() and blob_write() functions.
> I have already tuned the PRAGMAs for journal/synchronous/page_size/cache,
> so that it's rather efficient.
> I do not DELETE any content and the whole database is dropped after use:
> VACUUM is not important.
> 
> There are other ways to optimize, but I wonder whether they are worth
> it, or whether the gain would be only marginal for what I am doing.
> 1) recompiling SQLite? Which compile options would help in this case?
> 2) using other memory allocators? I am not sure that writing big data
> blocks triggers many calls to malloc().
> 3) using compression? zlib could help, but since my data does not
> compress very well (let's say an average of 20% space can be saved per
> block), I am not sure that the saving in writing time would balance the
> compression time.
> 
> Of course, I am only asking for advice based on your experience; there
> is certainly no exact answer, and it will always depend on my data.
> 
> Regards,
> 
> Pierre Chatelier

Why do you not use the int converted to hex (sprintf("%08x", id))
as a file name and just use the file system?
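
(A minimal sketch of that idea; the directory, extension and helper name
are arbitrary examples.)

/* Sketch only: one file per block, named by the id in hex. */
#include <stdio.h>

int write_block(const char *dir, unsigned id, const void *data, size_t len)
{
    char path[512];
    FILE *f;
    size_t n;

    sprintf(path, "%s/%08x.bin", dir, id);  /* e.g. blocks/0000002a.bin */
    f = fopen(path, "wb");
    if (!f) return -1;
    n = fwrite(data, 1, len, f);
    fclose(f);
    return (n == len) ? 0 : -1;
}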

Mike


Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-13 Thread Pierre Chatelier
Hello,

> Have you considered storing the blob data in a file?
> Just keep a record of the blob in the sqlite db. If you store all the
> blobs in a single file, you'll need a filename, begin, and length;
> if one blob per file, just the filename.
> Just an idea. Not really sure which would be faster.
I do not do that because I will keep the db for querying later, and it
is far simpler for archiving to manage only one file.

Regards,

Pierre Chatelier



Re: [sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-13 Thread Ken

Pierre,

Have you considered storing the blob data in a file?

Just keep a record of the blob in the sqlite db. If you store all the blobs in a
single file, you'll need a filename, begin, and length; if one blob per file,
just the filename.

Just an idea. Not really sure which would be faster.


--- On Fri, 3/13/09, Pierre Chatelier  wrote:

> From: Pierre Chatelier 
> Subject: [sqlite] Advices to get max performance with SQLITE and BLOBS
> To: sqlite-users@sqlite.org
> Date: Friday, March 13, 2009, 3:49 AM
> Hello,
> 
> I am using SQLITE to store and retrieve raw data blocks that are
> basically ~300KB each. Each block has an int identifier, so that
> insert/select are easy. This is very basic use: I do not use complex
> queries, only "INSERT/SELECT where index=...".
> 
> Now, I am thinking about performance for writing a sequence of a few
> hundred 300KB blocks as fast as possible.
> Obviously, I use the bind_blob(), blob_read() and blob_write() functions.
> I have already tuned the PRAGMAs for journal/synchronous/page_size/cache,
> so that it's rather efficient.
> I do not DELETE any content and the whole database is dropped after use:
> VACUUM is not important.
> 
> There are other ways to optimize, but I wonder whether they are worth
> it, or whether the gain would be only marginal for what I am doing.
> 1) recompiling SQLite? Which compile options would help in this case?
> 2) using other memory allocators? I am not sure that writing big data
> blocks triggers many calls to malloc().
> 3) using compression? zlib could help, but since my data does not
> compress very well (let's say an average of 20% space can be saved per
> block), I am not sure that the saving in writing time would balance the
> compression time.
> 
> Of course, I am only asking for advice based on your experience; there
> is certainly no exact answer, and it will always depend on my data.
> 
> Regards,
> 
> Pierre Chatelier
> 


[sqlite] Advices to get max performance with SQLITE and BLOBS

2009-03-13 Thread Pierre Chatelier
Hello,

I am using SQLITE to store and retrieve raw data blocks that are
basically ~300KB each. Each block has an int identifier, so that
insert/select are easy. This is very basic use: I do not use complex
queries, only "INSERT/SELECT where index=...".

Now, I am thinking about performance for writing a sequence of a few
hundred 300KB blocks as fast as possible.
Obviously, I use the bind_blob(), blob_read() and blob_write() functions.
I have already tuned the PRAGMAs for journal/synchronous/page_size/cache,
so that it's rather efficient.
I do not DELETE any content and the whole database is dropped after use:
VACUUM is not important.
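
(For reference, a minimal sketch of that kind of setup; the PRAGMA values
and the table are only illustrative examples, not the settings actually
used here.)

/* Sketch only: per-connection setup of the sort described above. */
#include <sqlite3.h>

static int setup_db(sqlite3 *db)
{
    const char *sql =
        "PRAGMA journal_mode = MEMORY;"  /* or OFF, since the DB is throwaway */
        "PRAGMA synchronous = OFF;"
        "PRAGMA page_size = 32768;"      /* takes effect only before the DB is created */
        "PRAGMA cache_size = 2000;"
        "CREATE TABLE IF NOT EXISTS blocks(id INTEGER PRIMARY KEY, data BLOB);";
    return sqlite3_exec(db, sql, NULL, NULL, NULL);
}

Independently of the PRAGMAs, wrapping the whole batch of INSERTs in a
single BEGIN/COMMIT transaction is usually the single biggest win for
bulk writes.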

There are other ways to optimize, but I wonder whether they are worth
it, or whether the gain would be only marginal for what I am doing.
1) recompiling SQLite? Which compile options would help in this case?
2) using other memory allocators? (See the sketch after this list.) I am
not sure that writing big data blocks triggers many calls to malloc().
3) using compression? zlib could help, but since my data does not
compress very well (let's say an average of 20% space can be saved per
block), I am not sure that the saving in writing time would balance the
compression time.
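
(As an aside on points 1 and 2: real compile-time knobs include things
like SQLITE_DEFAULT_PAGE_SIZE, SQLITE_DEFAULT_CACHE_SIZE and
SQLITE_THREADSAFE, though whether any of them helps here is exactly the
open question. For point 2, a custom allocator can be plugged in through
sqlite3_config(); the sketch below only illustrates the mechanism with a
trivial malloc wrapper, it is not a recommendation.)

/* Sketch only: installing a custom allocator via
   sqlite3_config(SQLITE_CONFIG_MALLOC, ...).  This wrapper just forwards
   to malloc/free and keeps each allocation's size in an 8-byte header so
   that xSize can answer; a real allocator would replace these bodies. */
#include <sqlite3.h>
#include <stdlib.h>

typedef union { size_t n; double align8; } BlockHdr;

static void *x_malloc(int n) {
    BlockHdr *p = malloc(sizeof(BlockHdr) + (size_t)n);
    if (!p) return NULL;
    p->n = (size_t)n;
    return p + 1;
}
static void x_free(void *p) { if (p) free((BlockHdr *)p - 1); }
static void *x_realloc(void *p, int n) {
    BlockHdr *q;
    if (!p) return x_malloc(n);
    q = realloc((BlockHdr *)p - 1, sizeof(BlockHdr) + (size_t)n);
    if (!q) return NULL;
    q->n = (size_t)n;
    return q + 1;
}
static int x_size(void *p)      { return p ? (int)(((BlockHdr *)p - 1)->n) : 0; }
static int x_roundup(int n)     { return (n + 7) & ~7; }
static int x_init(void *a)      { (void)a; return SQLITE_OK; }
static void x_shutdown(void *a) { (void)a; }

static sqlite3_mem_methods mem_methods = {
    x_malloc, x_free, x_realloc, x_size, x_roundup, x_init, x_shutdown, 0
};

/* Must be called before any other SQLite API call (e.g. sqlite3_open). */
int install_allocator(void) {
    return sqlite3_config(SQLITE_CONFIG_MALLOC, &mem_methods);
}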

Of course, I am only asking for advice based on your experience; there
is certainly no exact answer, and it will always depend on my data.

Regards,

Pierre Chatelier
