I am trying to debug a lockup problem (and need help with debugging technique),
where hang means a thread stops at a known place during evaluation, and other
threads continue.
The code near the problem is like:
ec <- return $ encode command
l <- return $ BSL.length ec
ss <- return $ BSC.unpack ec
It does not matter if I use let or return, or if the length is taken after
unpack. I used return so I could use this code for tracing, with strictness to
try to find the exact statement that is the problem:
traceEventIO "sendCommand"
ec <- return $ encode command
traceEventIO $ "sendCommand: encoded"
l <- ec `seq` return $ BSL.length ec
traceEventIO $ "sendCommand: size " ++ (show l)
ss <- ec `seq` return $ BSC.unpack ec
When this runs, the program executes this many times, but always hangs under a
certain condition.
For good evaluations:
7f04173ff700: cap 0: sendCommand
7f04173ff700: cap 0: sendCommand: encoded
7f04173ff700: cap 0: sendCommand: size 4
7f04173ff700: cap 0: sendCommand: unpacked
7f04173ff700: cap 0: Sending command of size 4
7f04173ff700: cap 0: Sending command of size "\NUL\EOT"
7f04173ff700: cap 0: sendCommand: sent
7f04173ff700: cap 0: sendCommand: flushed
for bad evaluation:
7f04173ff700: cap 0: sendCommand
7f04173ff700: cap 0: sendCommand: encoded
The lockup occurs when length is taken.
The difference between the working and non-working case is as follows:
A wxHaskell callback stuffs some data in a TChan. A thread started at
application startup is reading the TChan and calling the code that hangs. If it
did not hang, it would send it by TCP to another process/computer.
In the working case the callback pops a dialog, and passes data from one TChan
to another TChan.
In the failing case, the data is used to generate strings in a wxHaskell grid,
then it is parsed, and a new data is made. The new data is a combination of old
and new pieces of sub data. The shape of the date is identical, because I am
not making any edits to the rows.
So when data that the callback sends to TChan is unmodified, no hang. But when
the data is used to make text, put it in the gui, process it, and generate new
data, it hangs.
As a test I modified the code so that the text is not put into the gui. The
results are the same. This indicates it has something to do with creating
strings and then data from strings and mixing old and new subdata. Strings are
created with show. Data is created by pattern matching and generating numbers
from strings. I should also point out that in the working case, the size of the
resulting string is small, say 3. In the hang case, the resulting string would
be long, say 5000-10000.
I assume there are no limits to the size of ByteStrings or fundemental issues
with the RTS stack/heap that require special settings.
I am using the following revisions:
GHC 7.8.3
base ==4.7.*,
mtl ==2.2.1,
containers == 0.5.5.1,
transformers ==0.4.1.0,
random == 1.0.1.1,
wx == 0.91.0.0,
wxcore == 0.91.0.0,
wxdirect == 0.91.0.0,
colour == 2.3.3,
stm == 2.4.2,
monad-loops == 0.4.2.1,
time == 1.4.2,
old-locale == 1.0.0.6,
fast-logger == 2.2.3,
network == 2.6.0.2,
bytestring == 0.10.4.0,
control-monad-loop == 0.1,
binary == 0.7.2.2,
I know that nobody can have an answer based on this. But what I am hoping is
either there is some known bug, or someone can guide me in narrowing it down.
The event log does not have anything unusual in it. Other threads keep running,
and I can exit the application normally. The thread does not throw an
exception. It just hangs.
When I run the app, I just use +RTS -v
Perhaps there are some other options that might give more info?
— SNIPPET of log —
7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
7fe544cfea40: cap 0: thread 5 stopped (yielding)
7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
7fe544cfea40: cap 0: thread 5 stopped (suspended while making a foreign call)
7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
7fe544cfea40: cap 0: thread 5 stopped (blocked on an MVar)
7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
7fe537eff700: cap 0: waking up thread 5 on cap 0
7fe537eff700: cap 0: thread 2 stopped (yielding)
7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC)
7fe544cfea40: cap 0: sendCommand
7fe544cfea40: cap 0: sendCommand: encoded
7fe544cfea40: cap 0: sendCommand: size 3 WORKS
HERE
7fe544cfea40: cap 0: sendCommand: unpacked
7fe544cfea40: cap 0: Sending command of size 3
7fe544cfea40: cap 0: Sending command of size "\NUL\ETX"
7fe544cfea40: cap 0: sendCommand: sent
7fe544cfea40: cap 0: sendCommand: flushed
7fe544cfea40: cap 0: thread 5 stopped (blocked on an MVar)
7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
7fe537eff700: cap 0: thread 2 stopped (yielding)
7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
7fe537eff700: cap 0: thread 2 stopped (suspended while making a foreign call)
7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
7fe537eff700: cap 0: waking up thread 5 on cap 0
…
7fe537eff700: cap 0: fetchTelemetryServer: got lock
7fe537eff700: cap 0: thread 45 stopped (yielding)
7fe537eff700: cap 0: running thread 7 (ThreadRunGHC)
7fe537eff700: cap 0: thread 7 stopped (heap overflow)
7fe537eff700: cap 0: requesting parallel GC
7fe537eff700: cap 0: starting GC
7fe537eff700: cap 0: GC working
7fe537eff700: cap 0: GC idle
7fe537eff700: cap 0: GC done
7fe537eff700: cap 0: GC idle
7fe537eff700: cap 0: GC done
7fe537eff700: cap 0: all caps stopped for GC
7fe537eff700: cap 0: finished GC
7fe537eff700: cap 0: running thread 7 (ThreadRunGHC)
7fe537eff700: cap 0: sendCommand
7fe537eff700: cap 0: sendCommand: encoded PROBLEM HERE
7fe537eff700: cap 0: thread 7 stopped (heap overflow)
7fe537eff700: cap 0: requesting parallel GC
7fe537eff700: cap 0: starting GC
7fe537eff700: cap 0: GC working
7fe537eff700: cap 0: GC idle
7fe537eff700: cap 0: GC done
7fe537eff700: cap 0: GC idle
7fe537eff700: cap 0: GC done
7fe537eff700: cap 0: all caps stopped for GC
7fe537eff700: cap 0: finished GC
7fe537eff700: cap 0: running thread 7 (ThreadRunGHC)
7fe537eff700: cap 0: thread 7 stopped (yielding)
7fe537eff700: cap 0: running thread 2 (ThreadRunGHC)
7fe537eff700: cap 0: thread 2 stopped (yielding)
7fe544cfea40: cap 0: running thread 442 (ThreadRunGHC)
7fe544cfea40: cap 0: thread 442 stopped (suspended while making a foreign call)
7fe544cfea40: cap 0: running thread 442 (ThreadRunGHC)
7fe544cfea40: cap 0: thread 442 stopped (suspended while making a foreign call)
7fe537eff700: cap 0: running thread 45 (ThreadRunGHC)
7fe537eff700: cap 0: fetchTelemetryServer: unlock
7fe537eff700: cap 0: fetchTelemetryServer
7fe537eff700: cap 0: fetchTelemetryServer: got lock
7fe537eff700: cap 0: fetchTelemetryServer: pump seq
_______________________________________________
Glasgow-haskell-users mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users