Keith Medcalf wrote:
> Andy Goth wrote:
>> There are two possibilities:
>>
>> 1. Transactions do work, but I'm misusing them and must learn how to be
>> more careful.  In this case, I will update documentation to properly
>> explain their use to others.
>>
>> 2. Transactions don't work, at least not for my task.  In this case, I
>> will do my best to investigate the problem and suggest a correction.
>>
>> Either way, the right thing for me to do is continue digging in.
>>
>> Going from BEGIN to BEGIN IMMEDIATE indeed fixed the test program I
>> wrote, but my application is still broken, and I don't know what's
>> different about it.  I'm working on instrumenting the fcntl() calls to
>> log the sequence of operations.
>
> Make sure you are doing nothing foolish like using "shared cache".

I investigated shared cache because the notifier sounded like a fun time, but I 
quickly realized the cache sharing only happens between separate connections 
opened within a single process.  While I need to be prepared for multiple 
threads within an application, my problem at the moment is with multiple 
processes, each with its own connection to a shared database file.

> Make sure that you are using the default SERIALIZED or FULLMUTEX thread
> management on all connections whether you think you know better or do
> not need it or whatever.  A few microseconds of CPU might save you many
> MONTHS of trying to locate a defect in your code.

I am using the default serialized mode.  At first I was explicitly asking for 
FULLMUTEX, but when I traced through the execution I saw that it was merely 
reinforcing the default I was getting anyway.
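For anyone who wants to double-check their own build, here's a quick sketch 
using Python's sqlite3 module (my assumption: the DB-API `threadsafety` value 3 
corresponds to the library being compiled with SQLITE_THREADSAFE=1, the 
serialized default; on Python versions before 3.11 the value is hardcoded 
rather than queried from the library):

```python
import sqlite3

# DB-API 2.0 threadsafety levels as exposed by the sqlite3 module:
#   0 - single-thread, 1 - multi-thread at module level, 3 - serialized.
# On Python 3.11+ this reflects the compiled SQLITE_THREADSAFE setting.
print(sqlite3.threadsafety)
```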

> If you think that you need to use either "shared cache" or some
> threading mode other than the default serialized, then do that ONLY
> AFTER your application is working using the defaults.  Do not fall into
> the premature optimization trap.

It's vastly more cost-effective to optimize engineer time than CPU time.

> Place all read and write statements which require a consistent database
> context in the same transaction.

Indeed, this is half the reason I'm using transactions to begin with.  The 
other half is to avoid having to wait for a disk sync for every row I insert.
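To make that second half concrete, here's roughly what I mean (a sketch with 
made-up table and file names): wrapping the whole batch in one transaction 
means a single journal sync at COMMIT instead of one per row.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "batch.db")

# isolation_level=None so we issue BEGIN/COMMIT ourselves.
db = sqlite3.connect(path, isolation_level=None)
db.execute("CREATE TABLE rows (n INTEGER)")

# One transaction around the whole batch: the disk sync happens once,
# at COMMIT, rather than after every single INSERT.
db.execute("BEGIN IMMEDIATE")
for n in range(1000):
    db.execute("INSERT INTO rows VALUES (?)", (n,))
db.execute("COMMIT")

print(db.execute("SELECT count(*) FROM rows").fetchone()[0])  # 1000
```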

> IF THE TRANSACTION WILL OR MAY WRITE TO THE DATABASE AT SOME POINT
> BEFORE COMMIT/ROLLBACK then start the transaction with BEGIN IMMEDIATE.
>
> [...]
>
> An attempt to UPGRADE a READ transaction to a WRITE transaction will
> DEADLOCK (and immediately return an error) if another WRITE TRANSACTION
> is pending.  (Hence always signal your intent when commencing a
> transaction by using the appropriate BEGIN syntax in order to avoid this
> state of affairs).

This is the major thing I was missing and was exactly the fault with the test 
program I posted.  Adding "IMMEDIATE" completely fixed it.

Oh, a bit more on that.  I still could get SQLITE_BUSY in that test program, 
but only if I made it fork() so many times that the cumulative sync times added 
up to more than the busy timeouts I was setting.  The later processes would 
give up if they had to wait for a hundred predecessors to insert their row.  
This is expected and correct operation, proving that SQLite really is serious 
about syncing to disk and that it is honoring the specified timeout.  Of 
course, to make this happen, I was going off the deep end and forking hundreds 
of processes at once.  Then when I increased the timeout to match, every last 
one of the processes was able to get its business done without complaint 
despite them all starting at the same time.  Very nice work, SQLite.

> WAL journalling only works where all database user processes are local
> to each other (ie, between connections all originating from the same
> computer -- where "same computer" means all processes have access to the
> same shared memory, so that you may determine whether a SYSPLEX or NUMA
> architecture constitutes "same computer" or not, by whether or not they
> can all access the same "shared memory" region).

I'm not using WAL.  The nature of the application is that the initial database 
population phase is dominated by (sometimes simultaneous) writes, but after 
that point all access is purely read-only.

> File locking only works reliably on local filesystems.  It is unreliable
> for *ALL* remote filesystems, even if the target "remote" filesystem is
> local.

Sadly, I am going to need NFS before I'm done, but since I know NFS is a 
problem, I'm sticking with local filesystems until things are stable.  For this 
specific application, I'll bypass NFS during the population phase by keeping 
the database in /tmp, then I'll move it to NFS for read-only access.  I have 
another application on the horizon that will need to be able to write over NFS, 
but I think it will be able to tolerate the theoretical possibility of losing 
data once in a blue moon.  It'll be an append-only log where it's not a big 
deal if a record occasionally goes missing just because two users happened to 
run at the same time.  If anyone disagrees, they can have me write a 
concurrency server to take the place of fcntl() locking.

NFS is a really massive problem for me, more than you realize.  I had 
everything working just fine over a year ago as a collection of Perl and Tcl 
scripts but then had to rewrite it all because of horrid bugs in the NFS driver 
on one particular computer, which did nasty things like breaking named pipes, 
randomly unmounting filesystems, and changing current working directories to 
***EMPTY STRING*** for no damn reason.  And that's all on top of the expected problems 
like broken file locking.  You better believe I'm upset with the admin for 
failing to address these problems.  Just because I've been gradually making 
progress working around them doesn't mean they're not problems.  The worst of 
it is knowing that the real fix is to just cut the number of servers in half 
and put the disk right inside the machine, since we don't even need NFS for 
what we're doing.  It's only there because it's always been there, and why fix 
what isn't broken?  And if a workaround exists, it's not broken!  Even if the 
workaround is to babysit long-running processes and be prepared to restart them 
over and over until they don't get zapped by the filesystem driver for no 
discernible reason.

> A local filesystem is defined as one wherein the filesystem code resides
> and is executed "on the same computer" as the application.  See the
> requirement 2 above to determine the meaning of "same computer".  The
> cable between the "computer running the filesystem code" and the "block
> storage device" may have a length varying from inches to many thousands
> of miles.  The command language the "filesystem" uses to access the
> underlying block storage is irrelevant (it may be ATA, SATA, SCSI, SAS,
> iSCSI, SAC (SCSI over Avian Carriers) or what ever your heart may
> desire).

I think I'm going to recommend that the admin switch from NFS to SAC in order 
to improve our latency and reliability.  Plus, birds tend to leave nice audit 
logs wherever they go, whereas right now I can't tell what's happening to our 
data.

Richard Damon wrote:
> The experimental concurrency option might also help, in that a write
> operation doesn't need an exclusive lock for the whole database, but
> does introduce more points where a transaction might fail and need to be
> rolled back.

I honestly don't see any reason why using BEGIN IMMEDIATE shouldn't be the full 
solution for my problem.  It makes perfect sense.  Anything more is overkill 
and would constitute me falling in love with sophistication at the expense of 
getting the job done.

Yet, I'm still having problems in my full application.  I need to continue with 
that logger, I guess.  Today has mostly been family time, so I've not made much 
progress.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
