At FOSDEM'11 there was some mumbling about LO's regression tests
taking two days to run due to some timeout kludgery necessitated by
occasional hangs at exit due to possible threading bugs.  Or something
like that.  I can't remember exactly.

Recently I've been improving Valgrind's Helgrind tool a bit, and I
thought I'd try it on a simple startup/exit of LO, to see what
happened.

It reports a whole bunch of lock order violations (potential
deadlocks) during both startup and shutdown, ending up with a thread
unlocking a not-locked lock, which doesn't sound good.
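
For anyone unfamiliar with what a lock order violation looks like, here is a minimal sketch.  It is not taken from LO or from the attached reports; the names lock_a, lock_b and lock_order_demo are made up for illustration.  Two threads take the same pair of locks in opposite orders.  The threads run one after the other, so this run cannot deadlock, but the inconsistent ordering is the kind of thing Helgrind flags as a potential deadlock.

```cpp
#include <mutex>
#include <thread>

// Hypothetical locks, for illustration only; not LO code.
static std::mutex lock_a;
static std::mutex lock_b;
static int shared_counter = 0;

// Takes lock_a first, then lock_b.
static void take_a_then_b() {
    std::lock_guard<std::mutex> hold_a(lock_a);
    std::lock_guard<std::mutex> hold_b(lock_b);
    ++shared_counter;
}

// Takes the same two locks in the opposite order.
static void take_b_then_a() {
    std::lock_guard<std::mutex> hold_b(lock_b);
    std::lock_guard<std::mutex> hold_a(lock_a);
    ++shared_counter;
}

// Runs the two threads strictly one after the other, so this particular
// run cannot deadlock.  The ordering is still inconsistent: if both
// threads ran at the same time, each could end up holding one lock
// while waiting for the other.
int lock_order_demo() {
    shared_counter = 0;
    std::thread first(take_a_then_b);
    first.join();
    std::thread second(take_b_then_a);
    second.join();
    return shared_counter;
}
```

Built with -pthread and run under "valgrind --tool=helgrind", a program calling lock_order_demo() should, as far as I can tell, draw a lock order complaint even though nothing actually hangs.  The usual fix is to pick one order for the two locks and use it everywhere.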

One thing I expected to see a lot of was false reports of races due to
release methods in thread-safe reference counted classes.  Helgrind
doesn't understand the implications of a 1 -> 0 refcount transition in
a release method -- that the calling thread is now the only owner, and
so can run the destructor without locking -- and requires that such
methods have a couple of lines of annotation explaining this.
However, I didn't see any races resulting from the lack of such
annotations, which surprised me.  Surely some part of LO uses
thread-safe reference counted classes?
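
To make the annotation point concrete, here is a rough sketch of an annotated release method.  It is not LO code.  The class RefCounted, the counter destroyed_count and the function release_demo are hypothetical names invented for this example.  ANNOTATE_HAPPENS_BEFORE and ANNOTATE_HAPPENS_AFTER are the client-request macros from valgrind/helgrind.h; the sketch falls back to no-ops when that header is not installed (the __has_include check assumes a reasonably recent compiler).

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Use Helgrind's client-request macros when the header is installed;
// otherwise define them as no-ops so the sketch still builds.
#if defined(__has_include)
#  if __has_include(<valgrind/helgrind.h>)
#    include <valgrind/helgrind.h>
#  endif
#endif
#ifndef ANNOTATE_HAPPENS_BEFORE
#  define ANNOTATE_HAPPENS_BEFORE(obj) ((void)0)
#endif
#ifndef ANNOTATE_HAPPENS_AFTER
#  define ANNOTATE_HAPPENS_AFTER(obj) ((void)0)
#endif

// Hypothetical counter so the demo can check the destructor ran once.
static std::atomic<int> destroyed_count{0};

// Hypothetical thread-safe reference counted class; not LO code.
class RefCounted {
public:
    RefCounted() : refcount_(1), payload_(0) {}

    void acquire() { refcount_.fetch_add(1, std::memory_order_relaxed); }

    void release() {
        // Everything this thread did to the object happens before
        // whatever the final releaser does next.
        ANNOTATE_HAPPENS_BEFORE(&refcount_);
        if (refcount_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            // 1 -> 0 transition: this thread is now the only owner, so
            // the destructor may touch the object without locking.
            ANNOTATE_HAPPENS_AFTER(&refcount_);
            delete this;
        }
    }

    // Ordinary accesses to the payload are protected by a lock.
    void set_payload(int value) {
        std::lock_guard<std::mutex> hold(lock_);
        payload_ = value;
    }

private:
    // Writes payload_ without taking lock_; this is the access a race
    // detector would complain about without the annotations above.
    ~RefCounted() {
        payload_ = -1;
        destroyed_count.fetch_add(1);
    }

    std::atomic<int> refcount_;
    std::mutex lock_;
    int payload_;
};

// Hands one reference to each worker, drops the creator's reference,
// and returns how many times the destructor ran.
int release_demo(int n_threads) {
    destroyed_count = 0;
    RefCounted* obj = new RefCounted();   // the creator's reference
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        obj->acquire();                   // one reference per worker
        workers.emplace_back([obj, i] {
            obj->set_payload(i);
            obj->release();
        });
    }
    obj->release();                       // drop the creator's reference
    for (auto& worker : workers) worker.join();
    return destroyed_count.load();
}
```

Whichever thread performs the last release runs the destructor, and the pair of annotations tells Helgrind that all earlier releases are ordered before it.  How Helgrind treats the std::atomic operations themselves may vary between versions, so take this as a starting point rather than a recipe.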

A bzip2'd text file containing the actual reports is attached.  It
also contains details of how to reproduce them.

I don't have time to chase these myself.  But I am happy to provide
guidance on the most effective use of Helgrind, if anyone else is
interested in chasing them.

J

Attachment: helgrind-results-for-LO-1.txt.bz2
Description: application/bzip

_______________________________________________
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice
