Re: [sqlite] Performance with journal_mode = off
On 29.03.2013 19:42, Jeff Archer wrote: From: James K. Lowden jklow...@schemamania.org To: sqlite-users@sqlite.org Your experiment is telling you different: transaction control costs more than I/O. But shouldn't transactions be disabled when journal_mode = off? Maybe that is a faulty assumption. If so, what is the point of journal_mode = off? For this purpose, I am very happy to give all the ACID promises. If I understand your point #2, I think you are saying that all of the inserts within a single transaction are not written to the disk (database or journal) until the transaction is committed. But that can't quite be the answer because if kept my transaction open long enough I would simple run out of memory and that doesn't seem to happen even when I have 1 million plus inserts. ___ If you keep your transaction open look at the database file size you'll see that the changes aren't written to the File until you commit to disk. 1 Million records are to few for modern systems to reach the out of memory limit. Lets say your records have 1 k size that would make 1G of Memory with overhead and virtual memory why would you expect out of memory ??? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
From : http://www.sqlite.org/pragma.html#pragma_journal_mode The OFF journaling mode disables the rollback journal completely. No rollback journal is ever created and hence there is never a rollback journal to delete. The OFF journaling mode disables the atomic commit and rollback capabilities of SQLite. The ROLLBACK http://www.sqlite.org/lang_transaction.html command no longer works; it behaves in an undefined way. Applications must avoid using the ROLLBACK http://www.sqlite.org/lang_transaction.html command when the journal mode is OFF. If the application crashes in the middle of a transaction when the OFF journaling mode is set, then the database file will very likely go corrupt. Meaning : You can use Transaction also with journal mode OFF. then there will be no journal file there will be no ROLLBACK command no atomic commit (see description of commit mechanism from the prior sent link) a transaction which is possible can leave you with a corrupt database if your application crashes due to software or power failures if you are sure you don't need the security of a journal file cause your application is crash proof and you won't get a power disk failure (assumption is a gun to shoot your feet) you can ommit a journal file but you will be able to use transacations. transactions improve your performance for bulk data transfer into a database significantly. hope this will help ;) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
In reference to your needs one more suggestion : If you put the blobs outside of a sqlite database and store your householding, indexing data inside your sqlite data i would suggest to use journal mode = on because your journal file and database file without the blobs has a small amount and if you loose your database due to a crash inside your application you would possibly loose the housekeeping data for your raw data in files and that would make the whole really big amount of storage useless. Keep your indexes and housekeeping data secure (my advice) if you don't want to gamble ! ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On 28.03.2013 14:03, Jeff Archer wrote: On Thu, Mar 28, 2013 at 8:24 AM, ibrahim ibrahim.a...@googlemail.com wrote: On 28.03.2013 13:09, Jeff Archer wrote: On Wed, Mar 27, 2013 at 6:35 PM, Simon Slavin slav...@bigfraud.org wrote: Reasonable figures. With 5764 writes to the disk in separates transactions you have quite a lot of reading of data plus 5764 attempts to update the database file. The updates have to be done in the right order, and each update has to wait for the disk to be rotated into the right position, though each update changes only a small amount of data (probably two sectors). But my most basic question remains. Why is single transaction faster than PRAGMA journal_mode = off? Seems to me that with no journal there should only be single set of writes to the actual db and that journaling would double the number of writes because data has to be written to journal file as well. 2.5 sec with journal 5.5 sec without journal = seems like this sould be the smaller number You should read the sections 3 forward. http://www.sqlite.org/atomiccommit.html A single transaction happens mostly in memory then is flushed to the OS Cache in one step. The Journal file (the amount of pages that will be changed) is small while inserting new data into a database and the OS File Cache is usually large enough ... Yes, I have read this. (And now re-read it) So, since much more work must be done when using journal file, why does it take longer to do the inserts when there is NO journal file? ___ You compare different approaches. You should compare a) Journal mode=on w/wo single transaction b) Journal mode=off w/wo single transaction Single transaction outweights journal mode in terms of performance. It reduces the amount of necessary physical read write operations on the OS File Cache and the Disk. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
From: James K. Lowden jklow...@schemamania.org To: sqlite-users@sqlite.org Your experiment is telling you different: transaction control costs more than I/O. But shouldn't transactions be disabled when journal_mode = off? Maybe that is a faulty assumption. If so, what is the point of journal_mode = off? For this purpose, I am very happy to give all the ACID promises. If I understand your point #2, I think you are saying that all of the inserts within a single transaction are not written to the disk (database or journal) until the transaction is committed. But that can't quite be the answer because if kept my transaction open long enough I would simple run out of memory and that doesn't seem to happen even when I have 1 million plus inserts. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
Date: Thu, 28 Mar 2013 14:13:51 +0100 From: ibrahim ibrahim.a...@googlemail.com ... You should compare a) Journal mode=on w/wo single transaction b) Journal mode=off w/wo single transaction So, this means I can use transaction when I have journal_mode = off? I did not understand that this was a possibility? I have re-read the PRAGMA journal_mode documentation and still it sounds to me like this cannot be done: The OFF journaling mode disables the atomic commit and rollback capabilities of SQLite. Maybe I am not reading throughly... Maybe all it truely says definitively is Applications must avoid using the ROLLBACK command when the journal mode is OFF. I will rerun my experiment with a single transaction with journal_mode = off. Thanks. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On 29 Mar 2013, at 6:42pm, Jeff Archer jsarc...@nanotronicsimaging.com wrote: But shouldn't transactions be disabled when journal_mode = off? No. The two things have nothing to do with one-another. Transactions are about grouping changes together so that either they all happen or none happen. Journalling is about surviving through crashes and power-failures, and being able to undo changes. Transactions are about programming. Journalling is about hardware failure. Either one can be useful without the other. Having seen you ask this question probably explains your earlier confusion. So yes, if you're doing trials, you have four test cases: (transactions on and off) times (journalling on and off) = 2 x 2. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Fri, Mar 29, 2013 at 4:05 PM, Simon Slavin slav...@bigfraud.org wrote: No. The two things have nothing to do with one-another. Transactions are about grouping changes together so that either they all happen or none happen. Journalling is about surviving through crashes and power-failures, and being able to undo changes. Transactions are about programming. Journalling is about hardware failure. Either one can be useful without the other. Thank you. This helps my understanding. This would be good information to have on the documentation page for PRAGMA journal_mode; Or, maybe this is common knowledge and I just illustrate my ignorance. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Wed, Mar 27, 2013 at 6:35 PM, Simon Slavin slav...@bigfraud.org wrote: Reasonable figures. With 5764 writes to the disk in separates transactions you have quite a lot of reading of data plus 5764 attempts to update the database file. The updates have to be done in the right order, and each update has to wait for the disk to be rotated into the right position, though each update changes only a small amount of data (probably two sectors). But my most basic question remains. Why is single transaction faster than PRAGMA journal_mode = off? Seems to me that with no journal there should only be single set of writes to the actual db and that journaling would double the number of writes because data has to be written to journal file as well. 2.5 sec with journal 5.5 sec without journal = seems like this sould be the smaller number ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On 28 Mar 2013, at 12:09, Jeff Archer wrote: But my most basic question remains. Why is single transaction faster than PRAGMA journal_mode = off? Seems to me that with no journal there should only be single set of writes to the actual db and that journaling would double the number of writes because data has to be written to journal file as well. 2.5 sec with journal 5.5 sec without journal = seems like this sould be the smaller number Your base-line for comparison is the case of multiple transactions with journalling. When you turn off journalling, you save something; when you consolidate the activity into a single transaction, you save something else. What you're seeing is that the saving achieved with reference to your base-line measurement by using a single transaction exceeds that achieved by disabling journalling. /Niall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Thu, Mar 28, 2013 at 8:35 AM, Niall O'Reilly niall.orei...@ucd.ie wrote: On 28 Mar 2013, at 12:09, Jeff Archer wrote: But my most basic question remains. Why is single transaction faster than PRAGMA journal_mode = off? Seems to me that with no journal there should only be single set of writes to the actual db and that journaling would double the number of writes because data has to be written to journal file as well. 2.5 sec with journal 5.5 sec without journal = seems like this sould be the smaller number Your base-line for comparison is the case of multiple transactions with journalling. When you turn off journalling, you save something; when you consolidate the activity into a single transaction, you save something else. What you're seeing is that the saving achieved with reference to your base-line measurement by using a single transaction exceeds that achieved by disabling journalling. /Niall Agreed. But the question is: Why? Without journaling only half as much writing to disk should occur so why would it take longer? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On 27.03.2013 22:55, Jeff Archer wrote: On Wed, Mar 27, 2013 at 5:46 PM, David King dk...@ketralnis.com wrote: I am populating a database with 5764 records using the exact same data set each time into a newly created file. When I use no explicit transactions (default atomic commit) it takes 17.7 seconds. When I set journal_mode = off, same operation takes 5.5 seconds. If I do all 5764 inserts within a single transaction only 2.5 seconds. That sounds about right, yeah. With journalling, most disk writes have to be done twice (once to the journal and once to the data file). Which is why I expected journal_mode = off to make it faster. But it is 3 seconds faster when I leave journaling enabled and do all writes within a single transaction. Seems like you have defined some sort of index. You will get faster bulk inserts when you create your indexes afterwards. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On 28.03.2013 13:09, Jeff Archer wrote: On Wed, Mar 27, 2013 at 6:35 PM, Simon Slavin slav...@bigfraud.org wrote: Reasonable figures. With 5764 writes to the disk in separates transactions you have quite a lot of reading of data plus 5764 attempts to update the database file. The updates have to be done in the right order, and each update has to wait for the disk to be rotated into the right position, though each update changes only a small amount of data (probably two sectors). But my most basic question remains. Why is single transaction faster than PRAGMA journal_mode = off? Seems to me that with no journal there should only be single set of writes to the actual db and that journaling would double the number of writes because data has to be written to journal file as well. 2.5 sec with journal 5.5 sec without journal = seems like this sould be the smaller number You should read the sections 3 forward. http://www.sqlite.org/atomiccommit.html A single transaction happens mostly in memory then is flushed to the OS Cache in one step. The Journal file (the amount of pages that will be changed) is small while inserting new data into a database and the OS File Cache is usually large enough ... ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Thu, Mar 28, 2013 at 8:24 AM, ibrahim ibrahim.a...@googlemail.com wrote: On 28.03.2013 13:09, Jeff Archer wrote: On Wed, Mar 27, 2013 at 6:35 PM, Simon Slavin slav...@bigfraud.org wrote: Reasonable figures. With 5764 writes to the disk in separates transactions you have quite a lot of reading of data plus 5764 attempts to update the database file. The updates have to be done in the right order, and each update has to wait for the disk to be rotated into the right position, though each update changes only a small amount of data (probably two sectors). But my most basic question remains. Why is single transaction faster than PRAGMA journal_mode = off? Seems to me that with no journal there should only be single set of writes to the actual db and that journaling would double the number of writes because data has to be written to journal file as well. 2.5 sec with journal 5.5 sec without journal = seems like this sould be the smaller number You should read the sections 3 forward. http://www.sqlite.org/atomiccommit.html A single transaction happens mostly in memory then is flushed to the OS Cache in one step. The Journal file (the amount of pages that will be changed) is small while inserting new data into a database and the OS File Cache is usually large enough ... Yes, I have read this. (And now re-read it) So, since much more work must be done when using journal file, why does it take longer to do the inserts when there is NO journal file? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Thu, Mar 28, 2013 at 9:03 AM, Jeff Archer jsarc...@nanotronicsimaging.com wrote: Yes, I have read this. (And now re-read it) So, since much more work must be done when using journal file, why does it take longer to do the inserts when there is NO journal file? Much work must also be done to commit, with our without a journal. So if you only commit once, that runs faster than committing 5764 times. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On 28 Mar 2013, at 12:43pm, Jeff Archer jsarc...@nanotronicsimaging.com wrote: But the question is: Why? Without journaling only half as much writing to disk should occur so why would it take longer? But you are still doing 5764 sets of writing. One per transaction (which, since you are not defining transactions yourself is actually one per SQL command). When you define a single transaction you're actually doing just one write: the only thing that requires a write to disk is your COMMIT command. So naturally doing one write takes less time than doing 5764 write. Journalling isn't relevant to the above. Journals just add extra to the above, no matter whether you're doing a single transaction or many transactions. But they don't make anything like as much difference as the difference between 1 transaction and 5764 transactions. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Thu, 28 Mar 2013 08:43:50 -0400 Jeff Archer jsarc...@nanotronicsimaging.com wrote: When you turn off journalling, you save something; when you consolidate the activity into a single transaction, you save something else. What you're seeing is that the saving achieved with reference to your base-line measurement by using a single transaction exceeds that achieved by disabling journalling. /Niall Agreed. But the question is: Why? Without journaling only half as much writing to disk should occur so why would it take longer? Jeff, you seem to working from the assumption that turning off journalling should be the secret sauce because I/O is (or should be) the dominant factor. Your experiment is telling you different: transaction control costs more than I/O. There are two ways you can think about it. 1. Look at the code path through SQLite to perform an insert. If you're like me, you'll give up before you get all the way through. I don't see any wasted motion, but there's a lot of work. 2. Consider the analogy to write(2). Many small writes take longer than one large one (up to some value of large). Without testing on your system, I can tell you that writing 1 byte 5764 times, will take more than 2X the time to write 5764 bytes one time. The OS has some overhead to handling a write operation, yet makes no ACID promise. (Not even durability!) SQLite makes a higher promise, if you will, at a higher price. Per transaction. HTH. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Performance with journal_mode = off
Could someone please confirm if this makes sense. It is not what I expected. I have repeated several times so I believe these are the correct numbers. I am populating a database with 5764 records using the exact same data set each time into a newly created file. When I use no explicit transactions (default atomic commit) it takes 17.7 seconds. When I set journal_mode = off, same operation takes 5.5 seconds. If I do all 5764 inserts within a single transaction only 2.5 seconds. Jeff Archer Vice President of Software Engineering Nanotronics Imaging jsarc...@nanotronicsimaging.com 330819.4615 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
I am populating a database with 5764 records using the exact same data set each time into a newly created file. When I use no explicit transactions (default atomic commit) it takes 17.7 seconds. When I set journal_mode = off, same operation takes 5.5 seconds. If I do all 5764 inserts within a single transaction only 2.5 seconds. That sounds about right, yeah. With journalling, most disk writes have to be done twice (once to the journal and once to the data file). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Wed, Mar 27, 2013 at 5:46 PM, David King dk...@ketralnis.com wrote: I am populating a database with 5764 records using the exact same data set each time into a newly created file. When I use no explicit transactions (default atomic commit) it takes 17.7 seconds. When I set journal_mode = off, same operation takes 5.5 seconds. If I do all 5764 inserts within a single transaction only 2.5 seconds. That sounds about right, yeah. With journalling, most disk writes have to be done twice (once to the journal and once to the data file). Which is why I expected journal_mode = off to make it faster. But it is 3 seconds faster when I leave journaling enabled and do all writes within a single transaction. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On 27 Mar 2013, at 9:55pm, Jeff Archer jsarc...@nanotronicsimaging.com wrote: Which is why I expected journal_mode = off to make it faster. But it is 3 seconds faster when I leave journaling enabled and do all writes within a single transaction. When I set journal_mode = off, same operation takes 5.5 seconds. If I do all 5764 inserts within a single transaction only 2.5 seconds. Reasonable figures. With 5764 writes to the disk in separates transactions you have quite a lot of reading of data plus 5764 attempts to update the database file. The updates have to be done in the right order, and each update has to wait for the disk to be rotated into the right position, though each update changes only a small amount of data (probably two sectors). With 5764 writes to the disk in the same transaction, you have the same amount of reading needed, but only 1 attempt to write to the database file. Although all the same sectors need writing, each one needs writing once only and the sectors can be written in any order (your drivers will probably optimize the order or writes they happen in whatever order will be fastest). So yes, you save more time by doing your updates in one transaction than doing them separately. In fact I'd thought that your figures would be further apart than 55%. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance with journal_mode = off
On Wed, 27 Mar 2013 17:55:00 -0400, Jeff Archer jsarc...@nanotronicsimaging.com wrote: On Wed, Mar 27, 2013 at 5:46 PM, David King dk...@ketralnis.com wrote: I am populating a database with 5764 records using the exact same data set each time into a newly created file. When I use no explicit transactions (default atomic commit) it takes 17.7 seconds. When I set journal_mode = off, same operation takes 5.5 seconds. If I do all 5764 inserts within a single transaction only 2.5 seconds. That sounds about right, yeah. With journalling, most disk writes have to be done twice (once to the journal and once to the data file). Which is why I expected journal_mode = off to make it faster. But it is 3 seconds faster when I leave journaling enabled and do all writes within a single transaction. I miss one test case: both journal_mode = off, and all 5764 inserts within a single transaction. Give that one a thought (and a try) and you'll understand the difference. -- Groet, Cordialement, Pozdrawiam, Regards, Kees Nuyt ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users