Re: [sqlite] Reserve database pages
On Aug 13, 2010, at 8:41 PM, Max Vlasov wrote: > On Fri, Aug 13, 2010 at 1:38 PM, Max Vlasov > wrote: > >> >> I can approximately calculate, how big the new database will grow. Is >>> there a way to tell SQLite to reserve an inital space or numer of >>> pages >>> instead of letting the database file grow again and again? >>> >> >> >> Thought about this recently. Another idea is to tweak VFS. Since >> xWrite >> method is supposed to accept iOfst that is bigger than the current >> file >> size, one can check whether the new write query is going to >> allocate new >> space for the file (vs internal space writing), and before actual >> call make >> prior call of the same function writing for example a single zero >> byte a >> little far and after that perform the original request. The fossil tip at present supports the SQLITE_FCNTL_CHUNK_SIZE argument to sqlite3_file_control(). To allocate space in 1MB chunks: void setOneMBChunkSize(sqlite3 *db){ int szChunk = 1024*1024; sqlite3_file_control(db, "main", SQLITE_FCNTL_CHUNK_SIZE, (void*)&szChunk); } From that point on, connection "db" extends and truncates the db file in 1MB chunks. Works on unix and win32. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
On Fri, Aug 13, 2010 at 1:38 PM, Max Vlasov wrote: > > I can approximately calculate, how big the new database will grow. Is >> there a way to tell SQLite to reserve an inital space or numer of pages >> instead of letting the database file grow again and again? >> > > > Thought about this recently. Another idea is to tweak VFS. Since xWrite > method is supposed to accept iOfst that is bigger than the current file > size, one can check whether the new write query is going to allocate new > space for the file (vs internal space writing), and before actual call make > prior call of the same function writing for example a single zero byte a > little far and after that perform the original request. Took not so much time to try it, the good news is that the tweak technically works (thanks to the flexible VFS sqlite uses), the bad news is that it gives no significant improvement, at least on Windows XP, both FAT32 and NTFS, only in some specific cases a little difference was noticeable. Max ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
> I can approximately calculate, how big the new database will grow. Is > there a way to tell SQLite to reserve an inital space or numer of pages > instead of letting the database file grow again and again? > Thought about this recently. Another idea is to tweak VFS. Since xWrite method is supposed to accept iOfst that is bigger than the current file size, one can check whether the new write query is going to allocate new space for the file (vs internal space writing), and before actual call make prior call of the same function writing for example a single zero byte a little far and after that perform the original request. The only thing that can be affected in this case is the routines that use xFileSize. I looked at the sources, at least backup Api uses it, it either should be straightforward in this case or "pretend" and report the expected file size. >From what I see, it seems that the sqlite internally doesn't rely on the file size for the core functionality and xFileSize looks more like a service function, but I may be wrong Max Vlasov, www.maxerist.net ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
> > On 12 Aug 2010, at 12:37pm, Pavel Ivanov wrote: > >>> Here I (or we) think of the cycles the system needs when the small >>> niche >>> of the initial database is exhausted and it has to look for another >>> free >>> block on the filesystem. If you can tell the system in advance, how big >>> the niche has to be, it saves some time. >> >> Did you measure that or know of someone who measured? > > Also, you may get a big saving on the first few accesses after taking > those measures, but does that last past any subsequent changes ? A > real-life example follows: > > Some people make a big fuss about defragmentation under Linux, and it's > true that if you compare access times immediately before and after > defragmentation you can see a noticable speedup. But 24 hours after that, > there's almost no difference because Linux continually writes many small > log files which fragments the disk again. So most of the speed > improvement goes away in the first 24 hours. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > And besides that. I think if you reserve the space every time your database grows faster, than the data, that was input into the database, or? And even if you reserve space, if the last space (from an earlier reserve) is not fully filled, you have an empty place there. it fills this one and goes to the other free place. So at the end it doesn't change anything. Am i right? Artur Reilin sqlite.yuedream.de ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
On 12 Aug 2010, at 12:37pm, Pavel Ivanov wrote: >> Here I (or we) think of the cycles the system needs when the small niche >> of the initial database is exhausted and it has to look for another free >> block on the filesystem. If you can tell the system in advance, how big >> the niche has to be, it saves some time. > > Did you measure that or know of someone who measured? Also, you may get a big saving on the first few accesses after taking those measures, but does that last past any subsequent changes ? A real-life example follows: Some people make a big fuss about defragmentation under Linux, and it's true that if you compare access times immediately before and after defragmentation you can see a noticable speedup. But 24 hours after that, there's almost no difference because Linux continually writes many small log files which fragments the disk again. So most of the speed improvement goes away in the first 24 hours. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
TeDe escribió: > Hello, > > I want to import a big subset of data from one database to a new one. I > attach the two databases together and use > > insert into customers select * from source.customers where name LIKE 'x%' > > I can approximately calculate, how big the new database will grow. Is > there a way to tell SQLite to reserve an inital space or numer of pages > instead of letting the database file grow again and again? I'm looking > for a way to speed up the import. If whole database fits in ram you can: a) make a the new database :memory based, dump it at reload to a physical one b) make a ramdisk, create there the new db, dump when done and reload to a disk one Sqlite does this reservation already, but only a few pages. Perhaps you can modify the hard coded value. > Thanks in advance, > > Thomas ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
> The same with STL vectors: initializing it with a size is faster than > growing it element by element. That's pretty bad comparison. Initializing of vector with size is faster because it doesn't have to copy all elements later when it reallocates its memory. File system doesn't work that way, it doesn't change location of already written file data when you grow the file. Probably more appropriate comparison will be with STL deque. And depending on its internal implementation you can see very little improvement of reserving some size in deque compared to growing element by element. > Here I (or we) think of the cycles the system needs when the small niche > of the initial database is exhausted and it has to look for another free > block on the filesystem. If you can tell the system in advance, how big > the niche has to be, it saves some time. Did you measure that or know of someone who measured? I guess that can save some time (depending on file system implementation) if your disk is not fragmented at all and all free space is in one big chunk. And even in this case I doubt that speed up will be very big. But if your disk is heavily fragmented algorithm for allocating big file in the file system can be pretty much the same: find first free block, use it, if it's not enough find next one, use it etc. So it could be the same performance as when you grow your file block by block. Pavel On Thu, Aug 12, 2010 at 7:19 AM, TeDe wrote: > Hello Pawel, > > you made some good points. I'm still in the stage of evaluation, I don't > claim to know, its faster. But I saw that behavior on a filemanger: when > you copy a large file, it immediately reseveres the whole space. The > same with STL vectors: initializing it with a size is faster than > growing it element by element. > Therefor my question, if there is such a possibility. > >> I wouldn't be so sure about that. Did anybody make any measurements? >> 1) I don't know where do you think CPU cycles are saved but you >> definitely get some more CPU cycles in maintaining free-list of pages >> which will never be there if database grows page-by-page. > Here I (or we) think of the cycles the system needs when the small niche > of the initial database is exhausted and it has to look for another free > block on the filesystem. If you can tell the system in advance, how big > the niche has to be, it saves some time. >> 2) Even if you use Martin's technique by creating some big blob why do >> you think that SQLite will grow database file by necessary amount of >> pages at once instead of page-by-page? And is there something in >> SQLite's code that writes several sequential pages in one OS call >> instead of writing them page-by-page? > Thats indeed very uncertain. Here we have to look, how SQLite handles > this. If you have a transaction, SQLite could look, how much more space > is needed and preallocate this. >> I can agree that making file grow in one big piece instead of many >> small ones seems to compact most IO into one call instead of many. But >> will it be somehow faster? I doubt it. And bear in mind that all your >> efforts can be immediately trashed away by another process reading >> some big file(s) which will consume all OS file cache, so OS will have >> to re-read your database file later when you actually need it. This >> way I guess overall number of IO operations on the system will only >> increase... > That might be. We just can prove it one way: measure, measure, measure. > > I also don't expect huge performance increases, but if it is some > percent with little effort.. It could be worth it. > > Thomas > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
Am 12.08.2010 13:16, schrieb Martin.Engelschalk: > Am 12.08.2010 13:04, schrieb TeDe: >> Am 12.08.2010 12:16, schrieb Martin.Engelschalk: >>> Am 12.08.2010 12:08, schrieb TeDe: Hello, I want to import a big subset of data from one database to a new one. I attach the two databases together and use insert into customers select * from source.customers where name LIKE 'x%' I can approximately calculate, how big the new database will grow. Is there a way to tell SQLite to reserve an inital space or numer of pages instead of letting the database file grow again and again? I'm looking for a way to speed up the import. >>> Hello Thomas, >>> >>> I create a dummy table with a blob field and fill it with a very large >>> empty blob. Then I drop the table. The empty pages remain behind an can >>> the be used by the followimng inserts. >>> >> Hello Martin, >> >> that sounds like a good idea. Do you use it to have enough space for >> later operations or because you want to speed up the inserts? >> How big is the space you reserve by this and how much faster is it? I >> presume, you have to allocate quite big BLOBs. >> >> Best regards, >> >> Thomas > Hello Thomas, > > My primary goal was not a speedy insert but to avoid fragmentation of > the resulting database file, which slows down later access to the file. > So, this is not exactly on topic of your post. > I did not measure the changes in speed of the insert. However, later > selects, which in my case use practically all the data in the database, > speed up on the order of 20%. > I have to admit that this does not seem like much. However, my customer > for some reason did not like the fragmentation and insisted on a solution. Is that because of the lower fragmentation? Anyway, I consider 20% very good for the little effort you have to make, especially when you can re-use the code. If you have a well designed database and a well written application, it could become a hard job to squeeze out another 20% if you need them. Thomas ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
Hello Pawel, you made some good points. I'm still in the stage of evaluation, I don't claim to know, its faster. But I saw that behavior on a filemanger: when you copy a large file, it immediately reseveres the whole space. The same with STL vectors: initializing it with a size is faster than growing it element by element. Therefor my question, if there is such a possibility. > I wouldn't be so sure about that. Did anybody make any measurements? > 1) I don't know where do you think CPU cycles are saved but you > definitely get some more CPU cycles in maintaining free-list of pages > which will never be there if database grows page-by-page. Here I (or we) think of the cycles the system needs when the small niche of the initial database is exhausted and it has to look for another free block on the filesystem. If you can tell the system in advance, how big the niche has to be, it saves some time. > 2) Even if you use Martin's technique by creating some big blob why do > you think that SQLite will grow database file by necessary amount of > pages at once instead of page-by-page? And is there something in > SQLite's code that writes several sequential pages in one OS call > instead of writing them page-by-page? Thats indeed very uncertain. Here we have to look, how SQLite handles this. If you have a transaction, SQLite could look, how much more space is needed and preallocate this. > I can agree that making file grow in one big piece instead of many > small ones seems to compact most IO into one call instead of many. But > will it be somehow faster? I doubt it. And bear in mind that all your > efforts can be immediately trashed away by another process reading > some big file(s) which will consume all OS file cache, so OS will have > to re-read your database file later when you actually need it. This > way I guess overall number of IO operations on the system will only > increase... That might be. We just can prove it one way: measure, measure, measure. I also don't expect huge performance increases, but if it is some percent with little effort.. It could be worth it. Thomas ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
Am 12.08.2010 13:04, schrieb TeDe: > Am 12.08.2010 12:16, schrieb Martin.Engelschalk: >> Am 12.08.2010 12:08, schrieb TeDe: >>>Hello, >>> >>> I want to import a big subset of data from one database to a new one. I >>> attach the two databases together and use >>> >>> insert into customers select * from source.customers where name LIKE 'x%' >>> >>> I can approximately calculate, how big the new database will grow. Is >>> there a way to tell SQLite to reserve an inital space or numer of pages >>> instead of letting the database file grow again and again? I'm looking >>> for a way to speed up the import. >>> >> Hello Thomas, >> >> I create a dummy table with a blob field and fill it with a very large >> empty blob. Then I drop the table. The empty pages remain behind an can >> the be used by the followimng inserts. >> > Hello Martin, > > that sounds like a good idea. Do you use it to have enough space for > later operations or because you want to speed up the inserts? > How big is the space you reserve by this and how much faster is it? I > presume, you have to allocate quite big BLOBs. > > Best regards, > > Thomas Hello Thomas, My primary goal was not a speedy insert but to avoid fragmentation of the resulting database file, which slows down later access to the file. So, this is not exactly on topic of your post. I did not measure the changes in speed of the insert. However, later selects, which in my case use practically all the data in the database, speed up on the order of 20%. I have to admit that this does not seem like much. However, my customer for some reason did not like the fragmentation and insisted on a solution. I calculate the size of the blobs dynamically based on what I know of the following inserts. Also, i watch the result of pragma freelist_count. The size varies between 1 MB and 1 GB. Martin ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
Am 12.08.2010 12:16, schrieb Martin.Engelschalk: > > Am 12.08.2010 12:08, schrieb TeDe: >> Hello, >> >> I want to import a big subset of data from one database to a new one. I >> attach the two databases together and use >> >> insert into customers select * from source.customers where name LIKE 'x%' >> >> I can approximately calculate, how big the new database will grow. Is >> there a way to tell SQLite to reserve an inital space or numer of pages >> instead of letting the database file grow again and again? I'm looking >> for a way to speed up the import. >> > Hello Thomas, > > I create a dummy table with a blob field and fill it with a very large > empty blob. Then I drop the table. The empty pages remain behind an can > the be used by the followimng inserts. > Hello Martin, that sounds like a good idea. Do you use it to have enough space for later operations or because you want to speed up the inserts? How big is the space you reserve by this and how much faster is it? I presume, you have to allocate quite big BLOBs. Best regards, Thomas ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
> I can see where he is coming from. By reserving the appropriate number of > pages up front the import does not have to wait for disk IO or CPU cycles if > it runs out of pages. I wouldn't be so sure about that. Did anybody make any measurements? 1) I don't know where do you think CPU cycles are saved but you definitely get some more CPU cycles in maintaining free-list of pages which will never be there if database grows page-by-page. 2) Even if you use Martin's technique by creating some big blob why do you think that SQLite will grow database file by necessary amount of pages at once instead of page-by-page? And is there something in SQLite's code that writes several sequential pages in one OS call instead of writing them page-by-page? 3) Even if you don't use Martin's technique and indeed let database grow somehow in one system call why do you think that letting file grow by one big chunk uses somehow less IO and CPU cycles inside OS file system support than letting it grow piece-by-piece, or should I say sector-by-sector? I can agree that making file grow in one big piece instead of many small ones seems to compact most IO into one call instead of many. But will it be somehow faster? I doubt it. And bear in mind that all your efforts can be immediately trashed away by another process reading some big file(s) which will consume all OS file cache, so OS will have to re-read your database file later when you actually need it. This way I guess overall number of IO operations on the system will only increase... Pavel On Thu, Aug 12, 2010 at 6:34 AM, Timothy A. Sawyer wrote: > I can see where he is coming from. By reserving the appropriate number of > pages up front the import does not have to wait for disk IO or CPU cycles if > it runs out of pages. > > --Original Message-- > From: Pavel Ivanov > Sender: sqlite-users-boun...@sqlite.org > To: General Discussion of SQLite Database > ReplyTo: General Discussion of SQLite Database > Subject: Re: [sqlite] Reserve database pages > Sent: Aug 12, 2010 06:20 > >> I can approximately calculate, how big the new database will grow. Is >> there a way to tell SQLite to reserve an inital space or numer of pages >> instead of letting the database file grow again and again? I'm looking >> for a way to speed up the import. > > Why do you think that this kind of function will speed up your import? > > > Pavel > > On Thu, Aug 12, 2010 at 6:08 AM, TeDe wrote: >> Hello, >> >> I want to import a big subset of data from one database to a new one. I >> attach the two databases together and use >> >> insert into customers select * from source.customers where name LIKE 'x%' >> >> I can approximately calculate, how big the new database will grow. Is >> there a way to tell SQLite to reserve an inital space or numer of pages >> instead of letting the database file grow again and again? I'm looking >> for a way to speed up the import. >> >> Thanks in advance, >> >> Thomas >>___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ** Sent from my mobile device with the tiny keys ** > Timothy A. Sawyer, CISSP > Managing Director > MBD Consulting, LLC > 55 Madison Av., Suite 400 > Morristown, NJ 07960 > Phone: (973) 285-3261 | Fax: (973) 538-0503 > Web: http://www.mybowlingdiary.com > Email: tsaw...@mybowlingdiary.com > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
I can see where he is coming from. By reserving the appropriate number of pages up front the import does not have to wait for disk IO or CPU cycles if it runs out of pages. --Original Message-- From: Pavel Ivanov Sender: sqlite-users-boun...@sqlite.org To: General Discussion of SQLite Database ReplyTo: General Discussion of SQLite Database Subject: Re: [sqlite] Reserve database pages Sent: Aug 12, 2010 06:20 > I can approximately calculate, how big the new database will grow. Is > there a way to tell SQLite to reserve an inital space or numer of pages > instead of letting the database file grow again and again? I'm looking > for a way to speed up the import. Why do you think that this kind of function will speed up your import? Pavel On Thu, Aug 12, 2010 at 6:08 AM, TeDe wrote: > Hello, > > I want to import a big subset of data from one database to a new one. I > attach the two databases together and use > > insert into customers select * from source.customers where name LIKE 'x%' > > I can approximately calculate, how big the new database will grow. Is > there a way to tell SQLite to reserve an inital space or numer of pages > instead of letting the database file grow again and again? I'm looking > for a way to speed up the import. > > Thanks in advance, > > Thomas >___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ** Sent from my mobile device with the tiny keys ** Timothy A. Sawyer, CISSP Managing Director MBD Consulting, LLC 55 Madison Av., Suite 400 Morristown, NJ 07960 Phone: (973) 285-3261 | Fax: (973) 538-0503 Web: http://www.mybowlingdiary.com Email: tsaw...@mybowlingdiary.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
> I can approximately calculate, how big the new database will grow. Is > there a way to tell SQLite to reserve an inital space or numer of pages > instead of letting the database file grow again and again? I'm looking > for a way to speed up the import. Why do you think that this kind of function will speed up your import? Pavel On Thu, Aug 12, 2010 at 6:08 AM, TeDe wrote: > Hello, > > I want to import a big subset of data from one database to a new one. I > attach the two databases together and use > > insert into customers select * from source.customers where name LIKE 'x%' > > I can approximately calculate, how big the new database will grow. Is > there a way to tell SQLite to reserve an inital space or numer of pages > instead of letting the database file grow again and again? I'm looking > for a way to speed up the import. > > Thanks in advance, > > Thomas > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Reserve database pages
Am 12.08.2010 12:08, schrieb TeDe: > Hello, > > I want to import a big subset of data from one database to a new one. I > attach the two databases together and use > > insert into customers select * from source.customers where name LIKE 'x%' > > I can approximately calculate, how big the new database will grow. Is > there a way to tell SQLite to reserve an inital space or numer of pages > instead of letting the database file grow again and again? I'm looking > for a way to speed up the import. > > Thanks in advance, > > Thomas > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > Hello Thomas, a short while ago there was a thread in this mailing list titled "Coping with database growth/fragmentation". Perhaps you want to search for it. AFAIK, there is no way to tell the Library to allocate a number of pages at once. I could use a function like this very well :-) I create a dummy table with a blob field and fill it with a very large empty blob. Then I drop the table. The empty pages remain behind an can the be used by the followimng inserts. Martin ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users