RE: Help/advise asked with deadlocks when opening Visio objects in Writer

2014-01-15 Thread Winfried Donkers
Hi Michael,

>> Since we started using LibreOffice versions later than 3.5 (I think), 
>> we have seemingly random problems with LibreOffice freezing when 
>> opening an embedded Visio object.
> Ok - do you have a stack-trace ? if you run 4.2 of course you can get 
> debugging symbols for that which would make the problem debuggable at least - 
> a deadlock is a beast that gives a nice stack trace (if you get all threads).

Michael Stahl pointed me to WinDbg and Kendy's wiki; we are currently 
trying to reproduce the problem and capture a stack trace. When we succeed, I 
will create a bug report with it and cc you and Michael Stahl.

>> Today, I had a breakthrough: a colleague reported that he received an 
>> error message, "Algemene OLE fout" [Dutch: "General OLE error"]. 
>> Normally, we don't get that, LibreOffice just freezes.
>> This string and opengrok led to ERRCODE_SO_GENERALERROR  belonging to 
>> the string, /core/sfx2/source/view/ipclient.cxx, which is the only file 
>> where ERRCODE_SO_GENERALERROR is used.
> How do you get from there to here:
>> This led to a TODO-comment in 
>> /core/embeddedobj/source/commonembedding/embobj.cxx, 
>> OCommonEmbeddedObject::doVerb( ... ):

The try blocks preceding the catch in which the error code is used all 
contain a call to doVerb(), so the link is a bit of an educated guess.

>> I know that the SolarMutex issue is getting attention, and that area 
>> is far beyond my capabilities.
> I rather suspect that the SolarMutex is not the problem in itself; but the 
> non-Solar mutex :-) Solar Mutex locking is relatively tractable and 
> comprehensible. The problem mostly comes when people try to be too clever and 
> use another mutex: which is a recipe for deadlocks.

All the more reason for me to try to keep away from mutexes. Which will 
have a catch, of course =)

Winfried

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: Help/advise asked with deadlocks when opening Visio objects in Writer

2014-01-15 Thread Michael Meeks
Hi Winfried,

On Wed, 2014-01-08 at 12:00 +0100, Winfried Donkers wrote:
> Background information:
> The company I work for uses MS Visio to create illustrations, which are
> embedded into Writer documents (and not saved separately as Visio document). 

Right =) hopefully this is getting a bit better these days with the
Visio & EMF+ fixes we've been doing for 4.2.

> Since we started using LibreOffice versions later than 3.5 (I think), we
> have seemingly random problems with LibreOffice freezing when opening an
> embedded Visio object.

Ok - do you have a stack-trace ? if you run 4.2 of course you can get
debugging symbols for that which would make the problem debuggable at
least - a deadlock is a beast that gives a nice stack trace (if you get
all threads).

> The problem is very hard to reproduce (I have been trying for months, and
> have succeeded only once*), but it can occur multiple times on a single
> day for a single user. 

If we can catch it there and get a good bug filed we can perhaps do
something about it.

> The only way out is to kill LibreOffice (or Visio if you're lucky) with
> loss of recent changes as result. For my colleagues it is an extremely
> annoying problem and it feeds strong anti-LibreOffice feelings.

Sounds like it would =)

> Today, I had a breakthrough: a colleague reported that he received an 
> error message, "Algemene OLE fout". Normally, we don't get that, LibreOffice
> just freezes.
> This string and opengrok led to ERRCODE_SO_GENERALERROR  belonging to the 
> string,
> /core/sfx2/source/view/ipclient.cxx, which is the only file where 
> ERRCODE_SO_GENERALERROR is used.

How do you get from there to here:

> This led to a TODO-comment in 
> /core/embeddedobj/source/commonembedding/embobj.cxx, 
> OCommonEmbeddedObject::doVerb( ... ):
>   " TODO: a gross hack to avoid deadlocks [...] "

Of course - that code is a shambles - as is most of our 'threading'
code which is mostly based on superstition and a feeling of safety bred
from races not happening that often in the real world =)

> I know that the SolarMutex issue is getting attention, and that area is
> far beyond my capabilities.

I rather suspect that the SolarMutex is not the problem in itself; but
the non-Solar mutex :-) Solar Mutex locking is relatively tractable and
comprehensible. The problem mostly comes when people try to be too
clever and use another mutex: which is a recipe for deadlocks.
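
To make the lock-ordering hazard concrete, here is a minimal C++ sketch. It
is not LibreOffice code: g_solar and g_other are hypothetical stand-ins for
the SolarMutex and some second, component-private mutex, and g_counter is
invented shared state. If one thread takes the two locks as solar -> other
while another takes them as other -> solar, each can end up holding one and
waiting forever for the other. Acquiring both through std::lock is one
standard way to avoid that, assuming both mutexes are visible at the same
call site:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins, NOT LibreOffice's real objects.
std::mutex g_solar;  // plays the role of the SolarMutex
std::mutex g_other;  // plays the role of a second, component-private mutex
int g_counter = 0;   // shared state protected by both mutexes

// Deadlock-prone pattern, described for reference only (not implemented):
//   thread A: lock g_solar, then g_other
//   thread B: lock g_other, then g_solar
// If A and B each get their first mutex, neither can ever get its second.

// Safe variant: std::lock acquires both mutexes using a deadlock-avoidance
// algorithm, so the order in which a caller names them does not matter.
void update_solar_first() {
    std::unique_lock<std::mutex> a(g_solar, std::defer_lock);
    std::unique_lock<std::mutex> b(g_other, std::defer_lock);
    std::lock(a, b);
    ++g_counter;
}

void update_other_first() {
    std::unique_lock<std::mutex> b(g_other, std::defer_lock);
    std::unique_lock<std::mutex> a(g_solar, std::defer_lock);
    std::lock(b, a);
    ++g_counter;
}

// Runs both variants concurrently and returns the final counter value.
int run_both(int iterations) {
    assert(iterations >= 0);
    g_counter = 0;
    std::thread t1([iterations] {
        for (int i = 0; i < iterations; ++i) update_solar_first();
    });
    std::thread t2([iterations] {
        for (int i = 0; i < iterations; ++i) update_other_first();
    });
    t1.join();
    t2.join();
    return g_counter;
}
```

In a real code base the two locks are usually taken in different functions,
far apart, which is exactly why a second mutex is so easy to get wrong.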

> But could there be a way to recognize these deadlocks and kill these
> deadlocks without killing the LibreOffice application ?

Assuming that we can write a quick mutex (and bear in mind that we
pointlessly take bazillions of these and release them again - just for
the sheer joy of it ;-) often on each method call) that can detect a
deadlock. Then what do we do ? :-) of course, we could try to break the
lock, steal the mutex from the other thread, and let one thread run on
and hand the mutex back later when it was unlocked ;-) but - it seems
like a horrific way to try to deal with the underlying issue.
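
As a hedged illustration of what "detecting" a deadlock would involve: if
every mutex recorded its owner and every blocked thread recorded which mutex
it is waiting for, a deadlock would show up as a cycle in the resulting
wait-for graph. The sketch below checks for such a cycle. All names
(LockTable, would_deadlock, plain integer thread and mutex ids) are invented
for illustration and do not correspond to anything in LibreOffice; it also
assumes non-recursive mutexes and that the table itself is updated safely.
It only answers the detection half of the question, not what to do next:

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>

using ThreadId = int;  // hypothetical thread identifier
using MutexId = int;   // hypothetical mutex identifier

struct LockTable {
    std::unordered_map<MutexId, ThreadId> owner;        // mutex -> holder
    std::unordered_map<ThreadId, MutexId> waiting_for;  // blocked thread -> mutex
};

// Returns true if letting `thread` block on `wanted` would close a cycle,
// i.e. the chain "owner of wanted -> mutex that owner waits for -> ..."
// leads back to `thread` itself.
bool would_deadlock(const LockTable& t, ThreadId thread, MutexId wanted) {
    std::unordered_set<ThreadId> seen;
    MutexId m = wanted;
    for (;;) {
        auto own = t.owner.find(m);
        if (own == t.owner.end()) return false;   // mutex is free
        ThreadId holder = own->second;
        if (holder == thread) return true;        // chain leads back to us
        if (!seen.insert(holder).second) return false;  // cycle without us
        auto wait = t.waiting_for.find(holder);
        if (wait == t.waiting_for.end()) return false;  // holder is running
        m = wait->second;
    }
}

// Usage sketch: thread 10 holds mutex 1; thread 20 holds mutex 2 and is
// blocked waiting for mutex 1.
void demo() {
    LockTable t;
    t.owner[1] = 10;
    t.owner[2] = 20;
    t.waiting_for[20] = 1;
    assert(would_deadlock(t, 10, 2));   // 10 -> 2 -> 20 -> 1 -> 10: cycle
    assert(!would_deadlock(t, 30, 2));  // 30 merely queues behind 20 and 10
}
```

Keeping such a table current on every lock and unlock adds bookkeeping to
each of the "bazillions" of acquisitions mentioned above, which is part of
why this is unattractive as a general fix.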

> Possibly, with help from the experts, I might be able to create a
> temporary 'patch' ...

> I have not created a bug report for this, since I could find no way
> to reproduce the problem. Depending on your reaction(s) I will
> create the bug report.

As Michael says - if we get a trace [ of all threads! ] we should have
something to go on to fix this and (often) we can fix the problem easily
enough - often as a side-benefit introducing another threading hazard
elsewhere ;->

> *I ran version 4.1.4 and 4.2.0 concurrently, opened a Visio object in
> one of the two and both froze. Killing the one with the Visio object,
> made the other accessible again.

Interesting; well if you can reproduce this that's most interesting of
course.

Thanks for persisting ! looking forward to the trace,

ATB,

Michael.

-- 
 michael.me...@collabora.com  <><, Pseudo Engineer, itinerant idiot



RE: Help/advise asked with deadlocks when opening Visio objects in Writer

2014-01-08 Thread Winfried Donkers
Hi Michael,

> it's usually possible to fix a deadlock from just a backtrace of all threads.

> you say you have a user who has been blessed by the gods with an ability to 
> reproduce the problem, so give them an LO with debug symbols (perhaps use 
> Kendy's fancy symbol-server thing or build it yourself), and once it's 
> properly locked up attach Visual Studio (or windbg) and copy all the stacks 
> and file a bug (CC: me).

I will set that up and we'll see whether the user is really blessed.
But since small changes (like trying again one minute later) can make the 
problem impossible to reproduce, it may be a long time before we are able to 
actually catch a backtrace of the deadlock.

Winfried


Re: Help/advise asked with deadlocks when opening Visio objects in Writer

2014-01-08 Thread Michael Stahl
On 08/01/14 12:00, Winfried Donkers wrote:

> The company I work for uses MS Visio to create illustrations, which are 
> embedded into Writer documents (and not saved separately as Visio document). 
> Since we started using LibreOffice versions later than 3.5 (I think), we have 
> seemingly random problems with LibreOffice freezing when opening an embedded 
> Visio object.
> The problem is very hard to reproduce (I have been trying for months, and 
> have succeeded only once*), but it can occur multiple times on a single day 
> for a single user. 

> I know that the SolarMutex issue is getting attention, and that area is far 
> beyond my capabilities.
> But could there be a way to recognize these deadlocks and kill these 
> deadlocks without killing the LibreOffice application?

not really.

> Possibly, with help from the experts, I might be able to create a temporary 
> 'patch' ...
> I have not created a bug report for this, since I could find no way to 
> reproduce the problem. Depending on your reaction(s) I will create the bug 
> report.

it's usually possible to fix a deadlock from just a backtrace of all
threads.

you say you have a user who has been blessed by the gods with an ability
to reproduce the problem, so give them an LO with debug symbols (perhaps
use Kendy's fancy symbol-server thing or build it yourself), and once
it's properly locked up attach Visual Studio (or windbg) and copy all
the stacks and file a bug (CC: me).

