Re: [sqlite] How to use sqlite and pthread together?

Samuel Adam Sun, 20 Feb 2011 16:36:46 -0800

On Sun, 20 Feb 2011 14:46:06 -0500, Nico Williams <n...@cryptonector.com>  
wrote:


> On Sun, Feb 20, 2011 at 6:28 AM, Samuel Adam <a...@certifound.com> wrote:
>> On Sat, 19 Feb 2011 17:12:31 -0500, Pavel Ivanov <paiva...@gmail.com>
>> wrote:
>>
>>> [snip] On
>>> Windows it’s different - a process is a much more heavy-weight object
>>> than a thread and involves a much bigger system load to support it.
>>> There’s an official general advice for Windows: better create a new
>>> thread in the same process than a new process.
>>
>> Mr. Ivanov explained what I was saying better than I did.  My unclear
>> offhand comment about fork()/exec() was an allusion to why *nix developed
>> much lighter-weight processes than Windows, viz., decades of a
>> fork()/exec() custom and practice.  (Indeed, I believe that’s precisely
>> why Linux went to the trouble of re-engineering fork() with COW.)  I
>> intended to address the overhead of running, and inadvertently introduced
>> a red herring about overhead of starting.
>
> You seem to be conflating the weightiness of a notion of process with
> the weightiness of interfaces for creating processes.

I appreciate your extensive (if wildly off-topic) analysis as quoted
below.  You thoroughly misunderstood what I said, though.  Again, my  
fork()/exec() comment was directed to the same “cultural thing” as you  
spoke about in a different context; and my object thereby was to posit  
__why__ *nix kernel developers have more incentive to make sure processes  
run light.  Winapi doesn’t offer a really equivalent pair of syscalls, nor  
an extensive existing fork-exec practice, so NT kernel developers needn’t  
optimize that use case; whereas *nix kernel folks must of practical  
necessity design their process models to support a typical *nix code  
pattern.  If they do not do so, their users will complain bitterly about
the overhead of all their daemons’ zillion workers *after* those workers  
are started with the classic fork()/exec().

This being off-topic as it is, I must decline to continue discussing OS  
process practice in front of 10,000 or so people (or so I heard) who tuned  
in for discussion about SQLite.  You said some very interesting stuff,  
though, particularly as to the TLB.  I’d like to leave the door open to  
engaging such discussions in an appropriate venue sometime (ENOTIME for  
the foreseeable future).

Very truly,

Samuel Adam ◊ <http://certifound.com/>
763 Montgomery Road ◊ Hillsborough, NJ  08844-1304 ◊ United States
Legal advice from a non-lawyer: “If you are sued, don’t do what the
Supreme Court of New Jersey, its agents, and its officers did.”
http://www.youtube.com/watch?v=iT2hEwBfU1g


>
> fork() has nothing to do with whether a notion of process is
> light-weight or not.  And quite aside from that, fork() is only as
> light-weight as the writable resident set size of the parent process.
> Long, long ago fork() would copy the parent's address space.  Later on
> fork() implementations started marking what should be writable pages
> as read-only in the MMU page table entries for the process in order to
> catch writes and then copy-on-write.  COW works fine for
> single-threaded processes when the child of fork() intends to exec()
> or exit() immediately and the parent is willing to wait for the child
> to do so.  But for a heavily multi-threaded process with a huge RSS,
> such as a web browser, COW is a performance disaster as it means
> cross-calls to do MMU TLB shoot down, and then incurring a potentially
> large number of page faults in the parent as those threads continue
> executing.  Nowadays it's often simpler and faster to just copy the
> writable portion of the parent's RSS...  vfork(), OTOH, need only
> result in cross-calls to stop the parent's threads, but no page table
> manipulations, TLB shootdowns, data copies, nor page faults need be
> incurred.  And a true posix_spawn() wouldn't even have to stop the
> parent's threads (but using vfork() makes posix_spawn perform so well
> compared to fork() that, for example, Solaris' posix_spawn() just uses
> vfork()).  In Solaris, for example, we've obtained major performance
> improvements by having applications such as web browsers use
> posix_spawn() or vfork() in preference to fork().
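
The preference for posix_spawn() over fork() described above can be
sketched in C roughly as follows.  This is only an illustration under
stated assumptions: the helper name spawn_and_wait is hypothetical and
does not come from this thread, and whether posix_spawnp() is built on
vfork(), fork(), or something else depends on the platform's libc.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start argv[0] (searched for on PATH) as a child
 * process without calling fork() directly, wait for it, and return its
 * exit status.  Returns -1 if the spawn or wait fails, or if the child
 * did not exit normally. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp() reports failure through its return value,
     * not through errno. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0)
        return -1;

    /* Retry if a signal interrupts the wait. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argument vector such as { "sh", "-c", "exit 3", NULL },
the helper should return the child's exit status.  The parent never
duplicates its own address space in application code; how much work the
library does underneath is an implementation detail.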
>
> In any case, fork() is not an essential attribute of an operating
> system's notion of "process", but an incidental one (related to how
> one creates processes).  In terms of essential attributes, Unix and
> Windows processes compare to each other, and Windows and Unix threads
> (POSIX threads) also compare to each other (roughly anyways, as some
> pthreads implementations have M:N mappings to kernel constructs while
> others have 1:1, and so on).  Yes, Linux has clone(2), which allows
> one to decide just what parts of the parent's various attributes the
> child will share with the parent or get a copy of from the parent, but
> because the standard is pthreads, in practice most developers on Linux
> constrain themselves to using pthreads, thus the concept of clone(2)
> is not that relevant here.
>
>> Speaking as a user, by the way, I don’t think I actually have *any*
>> Windows applications which use worker processes for concurrency the same
>> way my *nix server daemons do.  There’s a reason for that.
>
> It's largely a cultural thing.  Windows NT and up had and promoted
> threading from the get-go, while Unix had a very long tradition of
> single-threaded processes, and some Unix systems had to catch up to
> Windows regarding multi-threading.  There are many other factors
> leading to this dichotomy, such as the fact that Unix developers tend
> to appreciate isolation, the fact that Windows' process spawn API is
> so complex and difficult to use, the fact that Windows allows
> individual threads of a process to execute with different access
> tokens in effect (thus reducing the need to start additional
> processes, even if this means losing a lot on the isolation front),
> etcetera.  OTOH I've no reason to believe that this split has anything
> to do with the weightiness of Windows processes vs. Unix ones (though
> the complexity of creating new processes certainly is involved).  But
> we do get many heavily multi-threaded applications on Unix nowadays.
> Perhaps the Windows model won out... threads _are_ easier to get
> started with than processes.
>
> Obtopic: I've successfully used SQLite_2_ with pthreads, and have
> every reason to believe that it is possible to safely and productively
> use SQLite3 with pthreads.  The key for using SQLite3 in a
> multi-threaded way is to adhere to good threaded programming
> guidelines while thoroughly understanding the APIs you choose to use.
> In particular you should do your utmost to minimize the use of any
> sharing of state across threads, and you should understand the
> pthreads APIs as well as membars and atomic operations provided
> outside pthreads.
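
The guideline above (keep state private to each thread and share as
little as possible) might look like this in C.  It is a minimal sketch,
not SQLite code: struct worker, run_workers, and the constants are
hypothetical names, and the comment marking where a per-thread database
handle could live is an assumption about how one might apply the advice.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_WORKERS      4
#define ITEMS_PER_WORKER 1000

/* The only state shared across threads: a single atomic counter. */
static atomic_long total_done;

/* State owned by exactly one worker.  Assumption: in a real program a
 * per-thread resource such as a database handle would live here rather
 * than being shared between threads. */
struct worker {
    pthread_t tid;
    long      local_done;
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    /* No lock needed: no other thread touches *w until after join. */
    for (int i = 0; i < ITEMS_PER_WORKER; i++)
        w->local_done++;

    /* Publish the result with one atomic read-modify-write. */
    atomic_fetch_add(&total_done, w->local_done);
    return NULL;
}

/* Hypothetical driver: starts the workers, joins them, and returns the
 * combined count. */
long run_workers(void)
{
    struct worker workers[NUM_WORKERS];
    int started = 0;

    atomic_store(&total_done, 0);

    for (int i = 0; i < NUM_WORKERS; i++) {
        workers[i].local_done = 0;
        if (pthread_create(&workers[i].tid, NULL,
                           worker_main, &workers[i]) != 0)
            break;
        started++;
    }

    /* pthread_join() also makes each worker's writes visible here. */
    for (int i = 0; i < started; i++)
        pthread_join(workers[i].tid, NULL);

    return atomic_load(&total_done);
}
```

Each worker touches the shared counter exactly once, so contention stays
low and there is no mutex to forget.  Build with the compiler's pthread
option (for example -pthread) where the platform requires it.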
>
> Nico
> --
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
