Re: [directfb-dev] fbdev: problems with suspend/resume

Denis Oliver Kropp Mon, 19 Feb 2007 14:45:13 -0800

Vaclav Slavik wrote:
> Hi,
> 
> I'm experiencing crashes in a DFB application (running on an ARM Linux 
> 2.6 machine, with fbdev) when the OS suspends and then resumes. Do 
> you have any idea what could be causing this and how to fix it? As 
> you'll see, we couldn't get very far debugging it...
> 
> What happens: after suspending the machine
> ("echo mem >/sys/power/state") and then resuming it, the DFB 
> application is gone, with the following log messages (this was 
> without debug mode enabled):
> 
> (!) [ 1076:   19.429] --> Caught signal 4 (at 0xbe826a2c, illegal trap) <--
> (!) FUSION_PROPERTY_LEASE    --> Connection timed out
> (!) [ 1076:   19.521] --> Caught signal 11 (at (nil), invalid address) <--
> Killed


Do you also have kernel messages regarding this?

> We initially suspected it has something to do with the clock being 
> bumped at wake-up and fusion module interpreting this as a time-out 
> and causing fusion_property_lease() to fail. I still think this could 
> be a problem, seeing that there's no suspend/resume related code in 
> linux-fusion, but it's more complicated than that, unfortunately:

Yes, clock skews are not really handled :(

I'd like to have a Fusion Time which is implemented in the kernel
module and does the necessary (platform specific?) things to provide a
skewless runtime counter, maybe in the shared area to avoid system
calls.

What other solutions are out there?
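
One userspace candidate, as a rough sketch only: measure timeouts against
CLOCK_MONOTONIC instead of the wall clock. On Linux this clock is not affected
by settimeofday() and, per the clock_gettime(2) man page, does not count time
spent suspended, so a clock bump at wake-up should not make a pending deadline
look expired. The names runtime_ms() and lease_deadline are hypothetical and
not part of Fusion. This does not cover the kernel module side, and the
behaviour on a particular 2.6 ARM kernel would need to be checked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not a Fusion API): milliseconds on a clock that
 * ignores wall-clock adjustments. */
static int64_t runtime_ms(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    /* With a valid pointer and CLOCK_MONOTONIC this is not expected to
     * fail on Linux; treat a failure as a programming error. */
    assert(rc == 0);
    (void)rc;

    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical deadline for something like a property lease. */
typedef struct {
    int64_t expires_ms;
} lease_deadline;

/* Start a deadline that expires timeout_ms from now. */
static lease_deadline lease_deadline_start(int64_t timeout_ms)
{
    lease_deadline d = { runtime_ms() + timeout_ms };
    return d;
}

/* True once the monotonic clock has reached the deadline. */
static bool lease_deadline_expired(const lease_deadline *d)
{
    return runtime_ms() >= d->expires_ms;
}
```

A waiter would call lease_deadline_start() once and poll
lease_deadline_expired() after each wake-up, rather than comparing against
gettimeofday().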

> Notice the SIGILL (=4) signal above. This happens *before* the system 
> is suspended, unfortunately without any useful backtrace:
> 
> (gdb) c
> Continuing.
> Breakpoint 2 at 0x4067f150: file /home/zeitlin/src/rea/src/DirectFB/src/core/core.c, line 633.
> Pending breakpoint "dfb_core_suspend" resolved
> 
> Program terminated with signal SIGKILL, Killed.
> The program no longer exists.
> 
> Also, there's the following output in debug build when suspending:
> 
> (!) [VT Switcher       9.060] ( 1359) *** Assumption [core != NULL] failed *** [/home/zeitlin/src/rea/src/DirectFB/src/core/core.c:633 in dfb_core_suspend()]

The assumption is ok. No harm.

But why does it switch VTs?

> (!) [ 1338:   48.076] --> Caught signal 4 (at 0xbede4954, illegal trap) <--
> (!) [Main Thread      48.081] ( 1338) *** Assertion [(thread)->magic == D_MAGIC("DirectThread")] failed *** [/home/zeitlin/src/rea/src/DirectFB/lib/direct/thread.c:294 in direct_thread_cancel()]
> 
> The assumption fails because vt_thread() calls the function with NULL 
> core, so there's clearly a bug here, but I wonder if the bug is 
> having the check or passing NULL from vt_thread()... 

The check is there for the ongoing migration from global variables.
In the end it should be possible to run several DirectFB cores in
one process. Imagine DirectFB on DirectFB with emulation for all
different hardware profiles, offering the same layers etc. as the
target platform, but implemented using Blit() for example.
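
To illustrate the migration pattern in general terms (a simplified sketch
with hypothetical names such as demo_core and DEMO_ASSUME, not the actual
code in core.c): the function takes an explicit core, logs a non-fatal
assumption failure when a caller still passes NULL, and falls back to the
legacy global so that old call sites keep working.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a core object. */
typedef struct {
    int id;
    int suspended;
} demo_core;

/* Legacy singleton that older callers implicitly rely on. */
static demo_core *demo_core_global;

/* Counts assumption failures so the behaviour can be observed. */
static int demo_assumption_failures;

/* Non-fatal check: report the problem and continue, unlike an assertion. */
#define DEMO_ASSUME(cond)                                                  \
    do {                                                                   \
        if (!(cond)) {                                                     \
            demo_assumption_failures++;                                    \
            fprintf(stderr, "*** Assumption [%s] failed ***\n", #cond);    \
        }                                                                  \
    } while (0)

/* Suspend the given core; NULL is tolerated during the migration. */
static int demo_core_suspend(demo_core *core)
{
    DEMO_ASSUME(core != NULL);

    if (!core)
        core = demo_core_global;   /* fall back to the legacy global */

    if (!core)
        return -1;                 /* no core exists at all */

    core->suspended = 1;
    return 0;
}
```

A caller that passes NULL only triggers the warning; the call still operates
on the global core, which is why such a message by itself does no harm.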

> Unfortunately, using no-vt-switching doesn't help much: we don't get 
> that assumption failure above, but the app still receives SIGILL (on 
> resume this time), with useless gdb backtrace.

Please build with "--enable-trace", which maintains a copy of the
call stack and therefore helps even if the stack is corrupted.

-- 
Best regards,
   Denis Oliver Kropp

.------------------------------------------.
| DirectFB - Hardware accelerated graphics |
| http://www.directfb.org/                 |
"------------------------------------------"

_______________________________________________
directfb-dev mailing list
[email protected]
http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-dev
