On Tue, 2007-07-03 at 11:09 +0200, Johan Borkhuis wrote:
> Philippe,
>
> Again, thank you for your very extensive answer. It is starting to get
> clearer what is happening here, but I still have some questions.
>
> The first one is probably an easy one: what do you mean by COW? I did
> try to find what this means, but I was only able to find references to
> farm animals.....

Ok, that's a good start.

> (and some references to lazy linking etc. but there
> COW was also not explained).

Search for "copy-on-write".

>
> Philippe Gerum wrote:
> > On Mon, 2007-07-02 at 16:18 +0200, Johan Borkhuis wrote:
> >
> >> Philippe,
> >>
> >> Philippe Gerum wrote:
> >>
> >>> Late binding to functions performed on behalf of the dynamic loader
> >>> against shared libraries shall need the kernel during symbol resolution
> >>> (internal syscalls) or execution (e.g. demand loading, COW), hence the
> >>> switch. Unfortunately, the I-pipe patch for PPC does not support
> >>> disabling all on-demand memory mappings for selected Linux tasks (only
> >>> the x86 and ARM patches support this feature so far).
> >>>
> >>>
> >> Thank you for your answer.
> >>
> >> Just for me to make sure I understand this correctly:
> >> We are not using shared libraries for our application, our applications
> >> are linked against .a files, which are included in the final application
> >>
> >
> > In such a case, you have likely hit an illustration of the latter issue
> > which the I-pipe/ppc implementation still suffers from: some page table
> > entries are missed during real-time operations. As a consequence of
> > this, the nucleus catches page faults on behalf of RT threads in primary
> > mode, then switches these threads back to secondary in order to process
> > the faults, and eventually wires the missing PTEs in. This is something
> > calling mlockall() does not prevent the application from (like COW).
> >
> As some PTEs are missed, does this mean that not the complete program
> was loaded into memory?
>
> What I understand until now about this process is the following:
> The program is executed. Not everything is loaded into memory by the
> dynamic loader. As the functions that are not in memory are accessed a
> page fault is created, and the page is made available. (Or is the page
> already in memory, but not made available to the application?)
> Is my assumption correct that once a page is accessed mlockall will take
> care that this page stays active, or is it possible that the page is
> moved out, and that another page fault occurs for the same page?
>
> > .....
> >
> > That is expected. If you switch the nucleus debug option on, you should
> > see Xenomai whining about secondary mode switches from code locations in
> > kernel space. This would confirm the fact that you have been hitting
> > this problem.
> >
>
> When looking at the nucleus debug output, I see a number of switches
> coming from user space, like this one:
>
> Jul 3 06:51:10 MVME3100-198 kernel: Xenomai: Switching testTask to
> secondary mode after exception #1025 from user-space at 0x10005f2c (pid
> 1069)
>
> I did try to find what exception 1025 is, but I could not find this
> reference. I expect indeed that this is a page-fault, but I am not sure.
> The location 0x10005f2c is the start of a function in one of the
> statically linked objects from one of our archives. You are referring to
> kernel space, but these exceptions are generated from user space. Is
> this different from what you are referring to?
> > There is not much to be done except improving the I-pipe/ppc support so
> > that it provides a way to pin down any PTE an application might refer
> > to. There might be other related issues beyond this one though.
> > Fortunately, mode transitions for dealing with such faults normally don't
> > lead to bad latencies on this arch. Do you confirm that, or are you
> > unlucky regarding this?
> >
>
> As this would normally happen only on startup it is not such a big deal.
> After the first couple of cycles this should not occur anymore, so
> operational RT performance is not compromised.
> The only problem is that the application switches to secondary mode, and
> that I have to switch it back to primary mode manually (or by doing a
> Xenomai system call).
> Is there a possibility to automatically switch
> back to primary mode when such a fault occurs?
>
> I would expect that this problem was popping up on other ppc platforms
> as well, but I did not find references to this. Is this something that
> is specific to this architecture, this platform, or are other people just
> ignoring this problem?
>
> And now for some personal thoughts.
> To be honest, I am a bit surprised by this problem (but that is caused
> by my lack of knowledge in this area).
> I am running this on a system without swap and with CONFIG_SWAP disabled
> in the kernel. My applications are linked statically (ldd gives: "not a
> dynamic executable"), so I would expect that everything is available in
> memory. But it looks like there are some things going on below the
> surface that I am not aware of (but that I am curious to find out).
>
> Kind regards,
> Johan Borkhuis

--
Philippe.

_______________________________________________
Xenomai-help mailing list
https://mail.gna.org/listinfo/xenomai-help
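
For reference, the thread discusses mlockall() and faults taken on the first access to a page. Below is a minimal sketch in C of the generic Linux approach: lock the address space, then touch the memory during initialization so that first-touch and copy-on-write faults happen before the time-critical phase starts. The helper names lock_all_memory() and prefault_region() are hypothetical and not part of Xenomai or the I-pipe. As noted above, this does not by itself address the missing-PTE limitation of the I-pipe/ppc patch; it only illustrates the standard locking and pre-faulting step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: lock all current and future mappings into RAM.
 * Returns 0 on success, -1 on failure (e.g. insufficient privileges
 * or a too-low RLIMIT_MEMLOCK). */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }
    return 0;
}

/* Hypothetical helper: write one byte back to itself at every page-size
 * stride of a WRITABLE buffer, so that demand-paging and copy-on-write
 * faults are taken now rather than later. The contents are unchanged.
 * Returns the number of strides touched. */
size_t prefault_region(volatile unsigned char *buf, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t touched = 0;

    assert(buf != NULL && page > 0);

    for (size_t off = 0; off < len; off += (size_t)page) {
        buf[off] = buf[off];   /* read, then write the same value back */
        touched++;
    }
    if (len > 0) {
        /* Also cover the final page when buf is not page-aligned. */
        buf[len - 1] = buf[len - 1];
    }
    return touched;
}
```

A typical use would be to call lock_all_memory() once at program start, then call prefault_region() on each buffer and stack area the real-time threads will use, before those threads enter their periodic loop.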
