I saw a couple of posts back that you are using MPI. Any chance that MPI is issuing a callback on a different thread? That can be a problem with C interop, and it can sometimes be solved by following the steps in the thread safety section of the manual: <http://docs.julialang.org/en/release-0.3/manual/calling-c-and-fortran-code/#thread-safety>.

On Sunday, September 21, 2014 1:44:23 PM UTC-4, Erik Schnetter wrote:
> Unfortunately I don't have a simple example that reproduces the
> problem. So far, I've managed to whittle it down to an application
> running in a single process without dependencies on external packages.
>
> -erik
>
> On Sun, Sep 21, 2014 at 1:04 PM, Tim Holy <tim....@gmail.com> wrote:
> > If you have/find a clean example, certainly posting an issue will make
> > sense. I can't comment on whether the task switch during I/O is
> > inevitable.
> >
> > --Tim
> >
> > On Sunday, September 21, 2014 10:25:11 AM Erik Schnetter wrote:
> >> I'm aware that Julia's threads are "green threads". The issue of
> >> thread safety still remains; if one thread is suspended in a critical
> >> region, another can enter that region. Storing handles in global data
> >> structures and incrementing global variables are such actions, and I'm
> >> not 100% sure that the respective region in serialize.jl are
> >> yield-free, even without my info output. I was surprised to see that
> >> I/O causes task switches -- maybe something else (hashing?
> >> dictionaries? creating new lambdas in C?) also causes task switches?
> >>
> >> gdb points to memory allocation routines in libc, called from gc.c or
> >> array.c. I assume that something overwrites memory, destroying libc
> >> malloc's data structures, leading to a crash later.
> >>
> >> -erik
> >>
> >> On Sun, Sep 21, 2014 at 5:26 AM, Tim Holy <tim....@gmail.com> wrote:
> >> > Hi Erik,
> >> >
> >> > First, one comment: tasks are not "true" (kernel) threads. Currently a
> >> > julia process is single-threaded. Tasks are better considered as a form
> >> > of cooperative multitasking.
> >> >
> >> > Yes, I've also found that I/O causes task switching. I don't personally
> >> > know a great way around this. One option would presumably be to have some
> >> > form of message queue; I am pretty sure that push!ing a new message on
> >> > it---as long as you don't need to touch I/O to create the message---would
> >> > not cause a switch. You can also use time() and other markers to indicate
> >> > the status of control flow.
> >> >
> >> > I haven't been reading things carefully enough to know whether there's any
> >> > history behind this, but if you haven't said so already...what does gdb
> >> > (or equivalent) say about the segfault?
> >> >
> >> > --Tim
> >> >
> >> > On Saturday, September 20, 2014 08:24:59 PM Erik Schnetter wrote:
> >> >> I am trying to track down a segfault in a Julia application. Currently I
> >> >> am zooming in on "deserialize", as avoiding calling it seems to reliably
> >> >> cure the problem, while calling it (even if not using the result) seems
> >> >> to reliably trigger the segfault.
> >> >>
> >> >> I am using many threads (tasks), and deserialize is called concurrently.
> >> >> Is this safe? I've been bitten in the past by this; e.g. I've
> >> >> accidentally added an "info" statement into a sequence of statements
> >> >> that needs to be atomic, and I/O apparently switches threads. Is there a
> >> >> list of known-to-be-safe or known-to-be-unsafe functions? Is
> >> >> deserialization thread-safe in this respect?
> >> >>
> >> >> I am in particular deserializing function calls and lambda expressions,
> >> >> and I see global variables ("lambda_numbers", "known_lambda_data"). Are
> >> >> the respective data structures (WeakKeyDict and Dict) thread-safe?
> >> >>
> >> >> Is there a locking mechanism in Julia? This would temporarily only allow
> >> >> a single thread (task) to run, aborting with an error if this thread
> >> >> becomes unrunnable. In other words, calling "yield" when holding a lock
> >> >> would be a no-op.
> >> >>
> >> >> -erik
> >
> >
>
> --
> Erik Schnetter <schn...@cct.lsu.edu>
> http://www.perimeterinstitute.ca/personal/eschnetter/
>
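
For what it's worth, the critical-region hazard described in the quoted messages (a task switch in the middle of a read-modify-write on shared state) can be sketched roughly as below. This is only an illustration under assumptions, not what serialize.jl does: it is written for current Julia (1.x) rather than the 0.3 release discussed here, the names `unsafe_increment!`, `locked_increment!` and `run_tasks` are made up for the example, and `yield()` stands in for anything that may switch tasks, such as an I/O call. Note also that `ReentrantLock` does not turn `yield` into a no-op as Erik proposed; the holder can still be switched out, but other tasks block before entering the guarded region.

```julia
# Illustrative sketch for current Julia (1.x); all function names are hypothetical.

# Read-modify-write with a task switch in the middle. yield() stands in for
# anything that may switch tasks, e.g. printing a log message.
function unsafe_increment!(counter::Ref{Int})
    old = counter[]
    yield()                 # another task may run here and read the same old value
    counter[] = old + 1
end

# Same update, but the whole read-modify-write is guarded by a lock, so a task
# that is switched out mid-update keeps other tasks out of the region.
function locked_increment!(counter::Ref{Int}, lk::ReentrantLock)
    lock(lk) do
        old = counter[]
        yield()
        counter[] = old + 1
    end
end

# Run n tasks that each apply f to a fresh shared counter; return the final count.
function run_tasks(f, n)
    counter = Ref(0)
    @sync for _ in 1:n
        @async f(counter)
    end
    return counter[]
end

unsafe_total = run_tasks(unsafe_increment!, 10)
lk = ReentrantLock()
locked_total = run_tasks(c -> locked_increment!(c, lk), 10)

println("without lock: ", unsafe_total, " of 10 increments survived")
println("with lock:    ", locked_total, " of 10 increments survived")
```

Without the lock, some increments are lost because several tasks read the same old value before any of them writes back; with the lock, all ten should survive. None of this involves kernel threads, so it would not explain corruption caused by a callback arriving from a foreign thread.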