On 20/01/16 22:01, sf...@users.sourceforge.net wrote:
> OmegaPhil:
>> It has now been some time since I got the kernel memory allocation
>> failures, so clearly the libau hack has fixed it - thanks.
> 
> Glad to hear that!
> (Honestly speaking, I totally forgot about this issue)
> 
> 
>> In the manpage, please can you change 'If you have a directory which has
>> millions of files' to say 'tens of thousands of files', and it would be
>> useful to mention 'page allocation failure' somehow so that its easy for
>       :::
> 
> How about the attached diff?

The diff looks good. However, for normal users it might be useful to
point them at the syslog, since normal programs will probably just throw
a useless generic 'I/O error':

'You may meet "out of memory" message or "page allocation failure" due
to the memory fragmentation or real starvation'

V

'A program using the directory may throw an "out of memory" error and/or
the kernel may output a "page allocation failure" associated with the
program in the syslog, due to memory fragmentation or real starvation'


>> rsync: readdir("/omega1-storage-4/." (in backups)): Invalid argument (22)
> 
> Hmm, won't you investigate it a little more?
> - which systemcall returned EINVAL(22)?
> - what parameter did rsync pass to the systemcall (or readdir)?
> 
> And is your $LIBAU set to "all"?

I did look into it on the rsync side, but it didn't look useful - see
https://download.samba.org/pub/unpacked/rsync/flist.c:send_directory,
where readdir is called on line 1739 and the error is reported on
line 1771.

The VM suddenly no longer errors in the particular test I set up, so
back on the server I fiddled with the rsync init.d script and ran the
daemon via 'strace -fv'. There was one EINVAL hit in the resulting
file; here it is with some context:

=======================================================================

[pid  1293] stat("/omega1-home/", {st_dev=makedev(0, 34), st_ino=273972,
st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096,
st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12,
st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0
[pid  1293] chdir("/omega1-home/")      = 0
[pid  1293] socketpair(PF_LOCAL, SOCK_STREAM, 0, [4, 6]) = 0
[pid  1293] fcntl(4, F_GETFL)           = 0x2 (flags O_RDWR)
[pid  1293] fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  1293] fcntl(6, F_GETFL)           = 0x2 (flags O_RDWR)
[pid  1293] fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  1293] clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f7b499149d0) = 1294
Process 1294 attached
[pid  1293] close(6 <unfinished ...>
[pid  1294] set_robust_list(0x7f7b499149e0, 24 <unfinished ...>
[pid  1293] <... close resumed> )       = 0
[pid  1294] <... set_robust_list resumed> ) = 0
[pid  1293] lstat(".",  <unfinished ...>
[pid  1294] close(4 <unfinished ...>
[pid  1293] <... lstat resumed> {st_dev=makedev(0, 34), st_ino=273972,
st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096,
st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12,
st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0
[pid  1294] <... close resumed> )       = 0
[pid  1293] openat(AT_FDCWD, ".",
O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC <unfinished ...>
[pid  1294] select(6, [5], [], [5], {60, 0} <unfinished ...>
[pid  1293] <... openat resumed> )      = 6
[pid  1293] brk(0x564d6bf68000)         = 0x564d6bf68000
[pid  1293] fstatfs(6, {f_type=0x61756673, f_bsize=4096,
f_blocks=3418641366, f_bfree=652846041, f_bavail=649635649, f_files=0,
f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0
[pid  1293] ioctl(6, _IOC(_IOC_READ|_IOC_WRITE, 0x41, 0x00, 0x40),
0x7ffc3f394810) = -1 EINVAL (Invalid argument)
[pid  1293] sendto(3, "<28>Jan 21 19:11:31 rsyncd[1293]"..., 103,
MSG_NOSIGNAL, NULL, 0) = 103
[pid  1293] fstatfs(6, {f_type=0x61756673, f_bsize=4096,
f_blocks=3418641366, f_bfree=652846041, f_bavail=649635649, f_files=0,
f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0
[pid  1293] futex(0x7f7b48b3d0a8, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  1293] close(6)                    = 0
[pid  1293] lstat(".", {st_dev=makedev(0, 34), st_ino=273972,
st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096,
st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12,
st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0

=======================================================================

After lstat'ing '.', rsync appears to go on and lstat the
subdirectories. I'm guessing that because the failing call is an ioctl,
it didn't appear in the usual '-e trace=file' invocation?


>> This appears to have happened after I upgraded the kernel to v4.3.3-5,
> 
> Is this version debian kernel pkg's?
> According to your post in last year, your system is
>       4.2.0-1-amd64 #1 SMP Debian 4.2.5-1 (2015-10-27) x86_64
>       GNU/Linux - Debian Testing standard kernel.
> 
> If this problem is specific to debian v4.3.3-5 kernel, then I will try
> finding the changes made in
> 1. vanilla v4.3.3
> 2. debian v4.3.3-5
> particularly around ioctl(2).

Just confirmed, on this kernel the setup is fine:

=======================================================================

Linux 4.2.0-1-amd64 #1 SMP Debian 4.2.6-1 (2015-11-10) x86_64 GNU/Linux

=======================================================================

On this it breaks:

=======================================================================

Linux 4.3.0-1-amd64 #1 SMP Debian 4.3.3-5 (2016-01-04) x86_64 GNU/Linux

=======================================================================

Yes, these are stock Debian kernels - the only special compilation I do
is your standalone aufs driver (there are some DKMS modules, mind).

Thanks

