Hello Renaud, Thursday, November 8, 2007, 9:11:41 AM, you wrote:
RH> Is the mmap option part of sqlite or do we need to change the sqlite code ? RH> If so, has somebody here already tried it ? Any source available ? RH> What 's the order of improvement ? RH> Renaud >> Date: Thu, 8 Nov 2007 15:54:35 +0500 >> From: [EMAIL PROTECTED] >> To: sqlite-users@sqlite.org >> Subject: Re: [sqlite] Disk caching impacts performance. >> >> Try this, use mmap >> (i assume u'r db is readonly)this is much faster and >> better than ifstream read. Also ifstream read can keep the data in cache as >> long as no other serious >> i/o occurs. >> >> U need to accept it as we work with Virtual Mem or >> write your own FileSystem which is mem based and short circuits os calls. >> Sqlite 3.5x has good support for such ext. >> >> regrads >> ragha >> >> ****************************************************************************************** >> This email and its attachments contain confidential information from HUAWEI, >> which is intended only for the person or entity whose address is listed >> above. Any use of the information contained herein in any way (including, >> but not limited to, total or partial disclosure, reproduction, or >> dissemination) by persons other than the intended recipient(s) is >> prohibited. If you receive this e-mail in error, please notify the sender by >> phone or email immediately and delete it! >> ***************************************************************************************** >> >> ----- Original Message ----- >> From: Julien Renggli >> Date: Thursday, November 8, 2007 4:15 pm >> Subject: [sqlite] Disk caching impacts performance. >> >>> Hello, >>> >>> >>> >>> I'm currently working on a project which is very likely to use >>> SQLite as >>> data storage. Since performance is important for us, I already >>> found on >>> the SQLite website many ways to optimise the code (always working in >>> transactions where possible, using a page size of 4096 since it's >>> running on the Windows platform, using integers primary keys, ...). >>> ButI have one problem that I "solved" in an unorthodox way; it >>> works, but >>> maybe you have a better solution than mine? >>> >>> >>> >>> I've been doing some test with a ~100 MB database, in which I have >>> three >>> tables: one for structured data (Objects, 2000 entries), one for the >>> blobs we have to store (ObjectBlobs ID INTEGER PRIMARY KEY, Data BLOB) >>> (8000 entries), and one which binds the structured data and the blobs >>> (8000 entries). As you can imagine, each Object has 4 blobs linked to >>> it; the blobs can be quite large (let's say up to 1 MB). >>> >>> My (C++) application just has to read the table "Objects", and one of >>> the Blobs for each Object. >>> >>> >>> >>> Now the first time I run my application, it takes quite a long time >>> >>> (30s) to load the blobs. But if I re-run the app, it only takes 1s to >>> load them. It's clearly a disk caching issue: if I copy huge files to >>> the disk between two runs, it takes again 30s to load the blobs (i.e. >>> >>> the DB is no more in the disk cache). Profiling the application >>> indicates sqlite::winRead() is the bottleneck. >>> >>> >>> >>> I then had the following idea: SQLite is probably reading the file >>> randomly, depending on where the data lies. If I can force the DB >>> to be >>> cached, everything should be fine. So before connecting the >>> database, I >>> first read it sequentially (using a C++ ifstream) until the end of >>> file. >>> It perfectly solves the database problem, even though I still >>> notice a >>> difference (3s to read the file on 1st run, 0.2s later). But 3s is OK >>> where 30s was worrying me. >>> >>> >>> >>> I hope I explain the situation clear enough, and ask you now: is it >>> theonly way to do it? I find the trick a bit nasty and don't like >>> it; maybe >>> I missed something? Before you ask: I tried to VACUUM the DB, it only >>> reduced the timings to 16s, which was still bad for our requirements. >>> >>> Tests with a larger DB (it can get much bigger than my example) and on >>> different machines tend to confirm my theory. >>> >>> >>> >>> Thanks in advance (and a big thank for SQLite which is really nice and >>> easy to use !), >>> >>> >>> >>> Julien Renggli >>> >>> >>> >>> P.S.: Some technical informations: >>> >>> - sqlite v.3.3.16 (we will upgrade to the latest version later), C API >>> >>> - Windows XP SP2 >>> >>> - Timings on Pentium 4 3.4GHz, 2GB RAM >>> >>> >>> >>> >> >> ----------------------------------------------------------------------------- >> To unsubscribe, send email to [EMAIL PROTECTED] >> ----------------------------------------------------------------------------- >> RH> _________________________________________________________________ RH> Explore the seven wonders of the world RH> http://search.msn.com/results.aspx?q=7+wonders+world&mkt=en-US&form=QBRE RH> ----------------------------------------------------------------------------- RH> To unsubscribe, send email to [EMAIL PROTECTED] RH> ----------------------------------------------------------------------------- "CreateFileMapping" is I believe the function in windows. You have to "CreateFile" to open the file, then map it. Then you can read forward in it like a pointer to a block of memory. I actually have a "Preload" function I use sometimes which reads the first 5-50 megs of the file to pre-cache it. -- Best regards, Teg mailto:[EMAIL PROTECTED] ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------