[Kernel-packages] [Bug 1650062] Re: Ubuntu16.04.01VM:Docker-Powervm aufs bad file panic while running tests in a docker container
** Tags removed: severity-critical
** Tags added: severity-high

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1650062

Title:
  Ubuntu16.04.01VM:Docker-Powervm aufs bad file panic while running
  tests in a docker container

Status in linux package in Ubuntu:
  New

Bug description:
  == Comment: #0 - Vinutha GS - 2016-12-13 02:47:35 ==
  When some of the base and io tests were run inside a docker container,
  the lpar crashed. The stack trace and other details are below.

  Steps to re-create -
  1. Install 16.04.02 on a PowerVM lpar.
  2. Ran setup general.
  3. Ran docker scripts (home-grown scripts) which do the docker package
     installation and other setup required to run STAF cases inside a
     docker container.
  4. We have a docker image from which we launch containers and start
     tests inside them. If complete details are required on how to
     execute the scripts, please let me know.
  5. STAF Base and IO tests were started inside containers successfully;
     after some time, I see the partition is in XMON.

  Docker info -

  docker info
  Containers: 0
   Running: 0
   Paused: 0
   Stopped: 0
  Images: 0
  Server Version: 1.12.1
  Storage Driver: aufs
   Root Dir: /var/lib/docker/aufs
   Backing Filesystem: extfs
   Dirs: 0
   Dirperm1 Supported: true
  Logging Driver: json-file
  Cgroup Driver: cgroupfs
  Plugins:
   Volume: local
   Network: null host bridge overlay
  Swarm: inactive
  Runtimes: runc
  Default Runtime: runc
  Security Options: apparmor
  Kernel Version: 4.4.0-53-generic
  Operating System: Ubuntu 16.04.1 LTS
  OSType: linux
  Architecture: ppc64le
  CPUs: 24
  Total Memory: 49.89 GiB
  Name: bamlp3
  ID: I7VI:G4RJ:RHTQ:WNGV:52FK:K7AZ:YDJQ:KFUM:P3UA:MZ3I:5XUY:WV3N
  Docker Root Dir: /var/lib/docker
  Debug Mode (client): false
  Debug Mode (server): false
  Registry: https://index.docker.io/v1/
  WARNING: No swap limit support
  Insecure Registries:
   127.0.0.0/8

  docker ps -a
  CONTAINER ID  IMAGE         COMMAND                 CREATED         STATUS         PORTS  NAMES
  61f2b8ab0a86  32d545c3ea01  "/bin/sh -c ./staf_io"  24 minutes ago  Up 24 minutes         bamlp3-io
  151da0322172  590e44f15214  "/bin/sh -c ./staf_ba"  30 minutes ago  Up 30 minutes         bamlp3-base

  Stack trace -

  8:mon> t
  [c00a5e147d10] da04ca98 aufs_flush_nondir+0x38/0x50 [aufs]
  [c00a5e147d40] c02e0428 filp_close+0x68/0xe0
  [c00a5e147dc0] c030f71c __close_fd+0xcc/0x150
  [c00a5e147e00] c02e04d4 SyS_close+0x34/0x90
  [c00a5e147e30] c0009204 system_call+0x38/0xb4
  --- Exception: c00 (System Call) at 3fff8bc217d8
  SP (3fffd85203b0) is in userspace
  8:mon> e
  cpu 0x8: Vector: 300 (Data Access) at [c00a5e147a40]
      pc: da04bdd4: au_do_flush+0x44/0x220 [aufs]
      lr: da04ca98: aufs_flush_nondir+0x38/0x50 [aufs]
      sp: c00a5e147cc0
     msr: 80009033
     dar: 28
   dsisr: 4000
    current = 0xc00a8b7fc8e0
    paca    = 0xcfb44c00   softe: 0   irq_happened: 0x01
      pid   = 11936, comm = remap_file_page
  8:mon>

  Release details -

  uname -r
  4.4.0-53-generic

  uname -a
  Linux bamlp4 4.4.0-53-generic #74-Ubuntu SMP Fri Dec 2 15:59:36 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

  == Comment: #6 - Vinutha GS - 2016-12-14 03:16:18 ==
  Please find the attached sosreport.
  Also, I have followed the steps for kdump; it is enabled now. I'm
  going to start the tests once again.

  == Comment: #12 - Kevin W. Rudd - 2016-12-14 16:06:46 ==
  The basic reason for the panic is that close was called on a file
  that was no longer valid. The f_count value was -8 for some reason,
  so it passed the following check in filp_close():

          if (!file_count(filp)) {
                  printk(KERN_ERR "VFS: Close: file count is 0\n");
                  return 0;
          }

  It then blew up in au_do_flush() because f_inode was NULL.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650062/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
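Comment #12 observes that an f_count of -8 gets past the zero check in the close path. As a rough illustration only, the userspace sketch below models that check. The names `struct fake_file`, `fake_file_count()` and `fake_close_check()` are hypothetical stand-ins invented for this example, not kernel APIs, and the real struct file uses an atomic counter rather than a plain long.

```c
/* Hypothetical stand-in for struct file; only the count matters here. */
struct fake_file {
	long f_count;
};

/* Hypothetical analogue of file_count(): just reads the counter. */
static long fake_file_count(const struct fake_file *f)
{
	return f->f_count;
}

/*
 * Mirrors the shape of the quoted check: only an exact zero is
 * rejected. Returns 0 when the close is refused, 1 when it would
 * carry on (in the kernel, on to the filesystem's flush callback).
 */
static int fake_close_check(const struct fake_file *f)
{
	if (!fake_file_count(f))
		return 0;	/* count is exactly zero: refuse */
	return 1;		/* any non-zero count, negative included */
}
```

Under these assumptions, a count of 0 is refused while a count of -8 is treated like any live reference, which matches the behaviour described in the comment.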
[Kernel-packages] [Bug 1650062] Re: Ubuntu16.04.01VM:Docker-Powervm aufs bad file panic while running tests in a docker container
** Changed in: linux (Ubuntu)
     Assignee: Taco Screen team (taco-screen-team) => Seth Forshee (sforshee)
[Kernel-packages] [Bug 1650062] Re: Ubuntu16.04.01VM:Docker-Powervm aufs bad file panic while running tests in a docker container
I haven't found any obvious source in aufs for the extra fputs which I
suspect are causing this problem. If you could either give me more
information so I can run the same tests myself (I'm guessing the problem
isn't arch-specific) or else reproduce the problem with the kernel
patched with something like the following, perhaps we can catch it in
the act. However, we might also just catch legitimate fputs, since the
erroneous ones could occur while the refcount is still positive ...

diff --git a/fs/file_table.c b/fs/file_table.c
index df66450fb443..d4911a6e8331 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -264,7 +264,9 @@ static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
 
 void fput(struct file *file)
 {
-	if (atomic_long_dec_and_test(&file->f_count)) {
+	long cnt = atomic_long_dec_return(&file->f_count);
+	WARN_ON(cnt < 0);
+	if (cnt == 0) {
 		struct task_struct *task = current;
 
 		if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
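For readers who want to see the idea behind the debugging patch without building a kernel, here is a minimal userspace sketch using C11 atomics. It is an approximation under stated assumptions: `struct fake_file`, its `underflows` field and `fake_fput()` are hypothetical names made up for this example, `atomic_fetch_sub()` stands in for the kernel's atomic_long_dec_return(), and a message on stderr stands in for WARN_ON().

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for struct file. */
struct fake_file {
	atomic_long f_count;
	long underflows;	/* bookkeeping for this demo only, not atomic */
};

/*
 * Drop one reference. Returns 1 if this call released the last
 * reference, 0 otherwise. A negative result is counted and reported,
 * playing the role of WARN_ON(cnt < 0) in the patch.
 */
static int fake_fput(struct fake_file *f)
{
	/* atomic_fetch_sub() returns the old value, so subtract 1 again. */
	long cnt = atomic_fetch_sub(&f->f_count, 1) - 1;

	if (cnt < 0) {
		f->underflows++;
		fprintf(stderr, "warning: f_count went negative (%ld)\n", cnt);
	}
	return cnt == 0;
}
```

Starting from a count of 1, the first `fake_fput()` reports the final release and the second trips the warning. As the message above cautions, the warning fires on whichever put pushes the count below zero, which need not be the call that was actually unbalanced.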
[Kernel-packages] [Bug 1650062] Re: Ubuntu16.04.01VM:Docker-Powervm aufs bad file panic while running tests in a docker container
From the attachment, file->f_inode is NULL. That shouldn't be the case
for a file passed to the aufs flush callback, so it's not a bug for aufs
to assume it will be non-NULL. However, in this case the file has a
negative reference count, so there must be some earlier problem with
reference counting. I'm not all that familiar with aufs, so I'll have to
spend some time looking for unbalanced fputs in aufs. It's possible,
however, that the problem lies outside of aufs.