On 21-1-2013 22:20, Michael Haberler wrote:
> On 21.01.2013 at 20:10, Gilles Chanteperdrix wrote:
>
>> On 01/21/2013 02:32 PM, Michael Haberler wrote:
>>
>>> On 21.01.2013 at 12:56, Gilles Chanteperdrix wrote:
>>>
>>>>> question: does a RTC time warp have any possible bearing on
>>>>> Xenomai operations?
>>>>
>>>> No, it should not, Xenomai uses its own clock, which is set only
>>>> once upon boot, so, is unaffected by Linux wallclock time
>>>> changes... or should be.
>>>
>>> it might not be Xenomai after all. Uhum.
>>>
>>> the bughunt safari tribe has decided to focus on class 'duh' problems
>>> and resolves to shut up until red hands are spotted.
>>
>> I would still put the check in the timer "set_next_event" callback, just
>> in case...
>
> I assume Bas will give the postmortem shortly - he nailed the issue; the RTC
> boot timewarp makes for a lost DHCP lease midflight and NFS freezing, making
> it look like a kernel hang.
>
> relieved,
>
> - Michael

Michael said it all, there's not much for me to add. I'll summarize the
case for the record ; )

Lesson learned: change only one variable at a time and don't assume
anything!

I had been using an NFS-mounted filesystem with the Beaglebone for over a
year without problems and had gotten used to its reliability (as I was
used to in a corporate environment in the past). Because the Xenomai
software was built with libraries (eabihf) that are not compatible with my
(eabi) system, I switched to the Ubuntu image Michael built, and
everything seemed to work fine. Except that the (Xenomai) kernel froze
after around 50-60 minutes of uptime.

With the JTAG debugger I could see the kernel still running, but all
applications (both text and X via SSH, and the console via the serial/USB
connection) seemed frozen and there was no output indicating what was
going on. Of course the Xenomai kernel was the first suspect, but that
proved to be a mistake. In hindsight, knowing the cause of the freeze, I
wonder why I never got an NFS connection time-out message on the console;
for some reason it isn't generated in this case.

The underlying problem is that the Beaglebone has no battery-backed
real-time clock. This only becomes a serious problem (a freeze) when all
of the following apply:

  (1) a network-mounted NFS root filesystem,
  (2) an initial kernel time lying in the past, and
  (3) a DHCP lease time shorter than some multiple (in this case 2x) of
      the required system uptime.

Ubuntu (and maybe Debian too) systems are obviously not designed to start
with a completely wrong real-time clock value. And dhclient (like many
other programs) is not designed to handle the large time step that occurs
once the clock is set properly at some point during the boot process.

Note that if the filesystem is on local storage (e.g. flash or hard disk),
there will only be a short disruption of the network connection, and it is
likely that the problem won't be noticed at all.

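To make the mechanism concrete, here is a minimal Python sketch of why a
forward clock step can make a lease look expired. It is a simplified model
and not dhclient's actual code: the Lease class, both helper functions and
all the numbers (a clock starting at 2000-01-01, a 2-hour lease, a step to
2013-01-21) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    acquired_wall: float  # wall-clock seconds at acquisition (may be wrong)
    acquired_mono: float  # monotonic seconds since boot at acquisition
    duration: float       # lease time in seconds


def expired_by_wall_clock(lease: Lease, wall_now: float) -> bool:
    """Expiry judged against an absolute wall-clock timestamp."""
    return wall_now >= lease.acquired_wall + lease.duration


def expired_by_monotonic(lease: Lease, mono_now: float) -> bool:
    """Expiry judged against elapsed time, unaffected by clock steps."""
    return mono_now - lease.acquired_mono >= lease.duration


# Hypothetical scenario: no battery-backed RTC, so the board boots with
# its clock at 2000-01-01, gets a 2-hour lease 30 s after boot, and the
# clock is stepped to the real date one minute after boot.
BOOT_WALL = 946684800.0   # 2000-01-01 00:00:00 UTC
TRUE_WALL = 1358726400.0  # 2013-01-21 00:00:00 UTC

lease = Lease(acquired_wall=BOOT_WALL + 30, acquired_mono=30.0,
              duration=7200.0)

wall_after_step = TRUE_WALL + 60  # wall clock right after the step
mono_after_step = 60.0            # only 60 s have really elapsed

print(expired_by_wall_clock(lease, wall_after_step))  # True
print(expired_by_monotonic(lease, mono_after_step))   # False
```

In this model the lease is only 30 seconds old, yet the wall-clock check
already reports it as expired, while the elapsed-time check does not.
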
A final solution hasn't been found yet. I would prefer a workaround that
doesn't require changing dhclient or some other standard program. I think
it would suffice to acquire a new lease right after the time step has been
made. This has to be done without giving up the previous lease (which has
expired because of the time step), because that would cause the system to
freeze again. Suggestions on how to do this are welcome.

I can't spend much more time on this issue this week.

-- Bas

_______________________________________________
Emc-developers mailing list
https://lists.sourceforge.net/lists/listinfo/emc-developers
