James B:
> The filesystem operations are indeed not heavy; I have a few cronjobs that
> perform various sanity checks by doing those mv/ls/cat/touch etc. and they
> will run at the same frequency whether the CPU is loaded or not. When the CPU
> is not loaded the kernel can last longer (so far I have tested up to 2 days).
> I will test more.
Ok.
If you can, please try testing without aufs: repeat the mv operations
under a heavy decoding workload.
> Thanks, I didn't notice this before. I will activate this debug switch and
> hopefully I can supply you with more info.
> EDIT: It seems to generate a huge amount of information, I'm not sure whether
> that will be useful for you.
In this case, the following approach may be more effective.
- Insert this just before every dput() in aufs_rename():
	au_debug_on();
	AuDbgDentry(d);
	au_debug_off();
	dput(d);
- Note that au_debug_on() is equivalent to setting the module parameter
  "debug" to 1, so during this short window unrelated debug messages from
  other processes may also be printed.
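
To avoid repeating the same four lines at every call site, they could be
wrapped in one helper. The following is only a sketch: au_dbg_dput() is a
hypothetical name that does not exist in aufs, and the struct dentry,
au_debug_on(), AuDbgDentry(), au_debug_off() and dput() defined here are
simplified user-space stand-ins so that the example builds outside the
kernel. In a real aufs tree only the helper itself would be kept.

```c
#include <stdio.h>

/*
 * User-space stand-ins (assumptions, NOT the real kernel/aufs definitions).
 * In an aufs source tree, drop this section and use the real ones.
 */
struct dentry {
	const char *name;
	int refcount;
};

static int debug_enabled;	/* mimics the "debug" module parameter */

static void au_debug_on(void)  { debug_enabled = 1; }
static void au_debug_off(void) { debug_enabled = 0; }

static void AuDbgDentry(const struct dentry *d)
{
	if (debug_enabled)
		fprintf(stderr, "dentry %p name=%s\n",
			(const void *)d, d ? d->name : "(null)");
}

static void dput(struct dentry *d)
{
	if (d)			/* NULL is simply ignored */
		d->refcount--;
}

/* Hypothetical helper: dump the dentry, then drop the reference. */
static void au_dbg_dput(struct dentry *d)
{
	au_debug_on();
	AuDbgDentry(d);
	au_debug_off();
	dput(d);
}
```

Each dput(d) in aufs_rename() would then become au_dbg_dput(d), which keeps
the window in which debugging is enabled as short as in the snippet above.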
> Contents of /proc/mounts:
:::
I guess you mounted aufs in initramfs and did switch_root/chroot, right?
If so, did you "mount --move" the branches before switch_root?
> I am not sure myself, I would have thought that would be the per-process stack
> size or the bottom of the stack. FYI the kernel is configured for 2G/2G
> split (instead of 3G/1G).
What is the stack size, 4K or 8K?
Now I am beginning to think the problem may lie outside aufs. The reasons
are:
- The address in your log is 0000003f, which is really strange. Even if
  aufs_rename() passed NULL to dput(), that cannot cause any problem;
  dput() simply returns immediately.
- Generally a structure and its members are aligned, so 0000003f should
  not happen. But this depends heavily on your machine, and I am not
  sure that such an alignment assumption holds on it.
- If aufs_rename() went totally crazy and passed 0x1 or something similar
  to dput(), then such a message could appear (maybe). But in that case I
  guess another problem would show up earlier.
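
The alignment argument can be illustrated with a small user-space sketch.
Everything in it is an assumption made for illustration: struct demo_obj
is a hypothetical stand-in (not the real struct dentry), and is_aligned()
and member_addr() are made-up helpers. It only shows that an aligned base
pointer plus the offset of a naturally aligned member gives an aligned
address, while a bogus base such as 0x1 gives an unaligned one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel object; NOT the real struct dentry. */
struct demo_obj {
	int count;		/* first member, offset 0 */
	void *parent;		/* naturally aligned pointer member */
	unsigned int flags;	/* naturally aligned word-sized member */
};

/* Nonzero if addr is a multiple of align (align must be a power of two). */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/* Address touched when a member at 'offset' is accessed through 'base'. */
static uintptr_t member_addr(uintptr_t base, size_t offset)
{
	return base + offset;
}
```

For example, is_aligned(0x3f, 4) is false, and so is
is_aligned(member_addr(0x1, offsetof(struct demo_obj, flags)), 4), whereas
the same member reached through an aligned base stays 4-byte aligned.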
Anyway, putting AuDbgDentry() before dput() will detect the problem a
little earlier, and we may be able to investigate further. Please try it.
Unfortunately, I cannot reproduce this problem on my side.
J. R. Okajima