Keith Medcalf wrote:
> Andy Goth wrote:
>> There are two possibilities:
>>
>> 1. Transactions do work, but I'm misusing them and must learn how to be
>> more careful. In this case, I will update documentation to properly
>> explain their use to others.
>>
>> 2. Transactions don't work, at least not for my task. In this case, I
>> will do my best to investigate the problem and suggest a correction.
>>
>> Either way, the right thing for me to do is continue digging in.
>>
>> Going from BEGIN to BEGIN IMMEDIATE indeed fixed the test program I
>> wrote, but my application is still broken, and I don't know what's
>> different about it. I'm working on instrumenting the fcntl() calls to
>> log the sequence of operations.
>
> Make sure you are doing nothing foolish like using "shared cache".
I investigated shared cache because the unlock notifier sounded like a fun time, but I quickly realized that cache sharing only happens between connections within a single process. While I need to be prepared for multiple threads within an application, my problem at the moment is with multiple processes, each with its own connection, to a shared database file.

> Make sure that you are using the default SERIALIZED or FULLMUTEX thread
> management on all connections whether you think you know better or do
> not need it or whatever. A few microseconds of CPU might save you many
> MONTHS of trying to locate a defect in your code.

I am using the default serialized mode. At first I was explicitly asking for FULLMUTEX, but when I traced through the execution I saw that it was merely reinforcing the default I was getting anyway.

> If you think that you need to use either "shared cache" or some
> threading mode other than the default serialized, then do that ONLY
> AFTER your application is working using the defaults. Do not fall into
> the premature optimization trap.

It's vastly more cost-effective to optimize engineer time than CPU time.

> Place all read and write statements which require a consistent database
> context in the same transaction.

Indeed, this is half the reason I'm using transactions to begin with. The other half is to avoid having to wait for a disk sync for every row I insert.

> IF THE TRANSACTION WILL OR MAY WRITE TO THE DATABASE AT SOME POINT
> BEFORE COMMIT/ROLLBACK then start the transaction with BEGIN IMMEDIATE.
>
> [...]
>
> An attempt to UPGRADE a READ transaction to a WRITE transaction will
> DEADLOCK (and immediately return an error) if another WRITE TRANSACTION
> is pending. (Hence always signal your intent when commencing a
> transaction by using the appropriate BEGIN syntax in order to avoid this
> state of affairs).

This is the major thing I was missing and was exactly the fault with the test program I posted.
Adding "IMMEDIATE" completely fixed it.

Oh, a bit more on that. I could still get SQLITE_BUSY in that test program, but only if I made it fork() so many times that the cumulative sync times added up to more than the busy timeout I was setting. The later processes would give up if they had to wait for a hundred predecessors to insert their rows. This is expected and correct operation, proving that SQLite really is serious about syncing to disk and that it honors the specified timeout. Of course, to make this happen, I was going off the deep end and forking hundreds of processes at once. Then, when I increased the timeout to match, every last one of the processes was able to get its business done without complaint, despite them all starting at the same time. Very nice work, SQLite.

> WAL journalling only works where all database user processes are local
> to each other (ie, between connections all originating from the same
> computer -- where "same computer" means all processes have access to the
> same shared memory, so that you may determine whether a SYSPLEX or NUMA
> architecture constitutes "same computer" or not, by whether or not they
> can all access the same "shared memory" region).

I'm not using WAL. The nature of the application is that the initial database population phase is dominated by (sometimes simultaneous) writes, but after that point all access is purely read-only.

> File locking only works reliably on local filesystems. It is unreliable
> for *ALL* remote filesystems, even if the target "remote" filesystem is
> local.

Sadly, I am going to need NFS before I'm done, but since I know NFS is a problem, I'm sticking with local filesystems until things are stable. For this specific application, I'll bypass NFS during the population phase by keeping the database in /tmp, then move it to NFS for read-only access.
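The timeout behavior from that experiment can be sketched without fork(), using threads that each open their own connection (which is equivalent for locking purposes, since SQLite's file locks are per connection). This is a hypothetical reconstruction, not the original test program; the table name, worker count, and 10-second timeout are made up:

```python
import os
import sqlite3
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), "busy.db")
init = sqlite3.connect(path)
init.execute("CREATE TABLE log (worker INTEGER)")
init.commit()
init.close()

N = 20
errors = []

def worker(i):
    # Each worker opens its own connection, as each forked process did.
    conn = sqlite3.connect(path, isolation_level=None)
    # Wait up to 10 seconds for the write lock instead of failing
    # immediately with SQLITE_BUSY.
    conn.execute("PRAGMA busy_timeout = 10000")
    try:
        conn.execute("BEGIN IMMEDIATE")  # declare write intent up front
        conn.execute("INSERT INTO log VALUES (?)", (i,))
        conn.execute("COMMIT")
    except sqlite3.OperationalError as e:
        errors.append((i, str(e)))
    finally:
        conn.close()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

check = sqlite3.connect(path)
rows = check.execute("SELECT count(*) FROM log").fetchone()[0]
check.close()
print("errors:", len(errors), "rows:", rows)  # errors: 0 rows: 20
```

Set the busy_timeout too low for the contention (or raise N far enough that the cumulative commit-and-sync time exceeds it) and some workers report "database is locked" instead, which matches what the fork() version did before the timeout was increased.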
I have another application on the horizon that will need to be able to write over NFS, but I think it will be able to tolerate the theoretical possibility of losing data once in a blue moon. It'll be an append-only log where it's not a big deal if a record occasionally goes missing just because two users happened to run at the same time. If anyone disagrees, they can have me write a concurrency server to take the place of fcntl() locking.

NFS is a really massive problem for me, more than you realize. I had everything working just fine over a year ago as a collection of Perl and Tcl scripts, but then I had to rewrite it all because of horrid bugs in the NFS driver on one particular computer, which did nasty things like break named pipes, randomly unmount filesystems, and change current working directories to ***EMPTY STRING*** for no damn reason. And that's all on top of the expected problems like broken file locking.

You better believe I'm upset with the admin for failing to address these problems. Just because I've been gradually making progress working around them doesn't mean they're not problems. The worst of it is knowing that the real fix is to just cut the number of servers in half and put the disk right inside the machine, since we don't even need NFS for what we're doing. It's only there because it's always been there, and why fix what isn't broken? And if a workaround exists, it's not broken! Even if the workaround is to babysit long-running processes and be prepared to restart them over and over until they don't get zapped by the filesystem driver for no discernible reason.

> A local filesystem is defined as one wherein the filesystem code resides
> and is executed "on the same computer" as the application. See the
> requirement 2 above to determine the meaning of "same computer". The
> cable between the "computer running the filesystem code" and the "block
> storage device" may have a length varying from inches to many thousands
> of miles.
> The command language the "filesystem" uses to access the
> underlying block storage is irrelevant (it may be ATA, SATA, SCSI, SAS,
> iSCSI, SAC (SCSI over Avian Carriers), or whatever your heart may
> desire).

I think I'm going to recommend that the admin switch from NFS to SAC to improve our latency and reliability. Plus, birds tend to leave nice audit logs wherever they go, whereas right now I can't tell what's happening to our data.

Richard Damon wrote:
> The experimental concurrency option might also help, in that a write
> operation doesn't need an exclusive lock for the whole database, but
> does introduce more points where a transaction might fail and need to be
> rolled back.

I honestly don't see any reason why using BEGIN IMMEDIATE shouldn't be the full solution to my problem. It makes perfect sense. Anything more is overkill and would amount to me falling in love with sophistication at the expense of getting the job done. Yet I'm still having problems in my full application, so I need to continue with that logger, I guess. Today has mostly been family time, so I've not made much progress.

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users