Test on Trusty
Before:
$ uname -a
Linux bionic 3.13.0-155-generic #205-Ubuntu SMP Fri Aug 10 15:53:26 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ echo 9 | sudo tee /proc/sys/kernel/printk
9
$ sudo losetup --find --show --partscan rlv_grkgld.1mb
<hung>
[ 270.506420] partition (null) (3 pp's found) is not contiguous
[ 270.510221] partition (null) (1 pp's found) is not contiguous
[ 270.513952] partition (null) (68 pp's found) is not contiguous
...
[ 270.593589] partition (null) (3 pp's found) is not contiguous
[ 270.595603] partition (null) (2 pp's found) is not contiguous
[ 270.597428] BUG: unable to handle kernel paging request at 0000000000001000
[ 270.599525] IP: [<ffffffff81379d4d>] strnlen+0xd/0x40
[ 270.601404] PGD 0
[ 270.601404] Oops: 0000 [#1] SMP
[ 270.601404] Modules linked in: squashfs isofs nls_iso8859_1 kvm_intel kvm serio_raw sch_fq_codel iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs libcrc32c raid10 raid456 async_memcpy async_raid6_recov async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear psmouse floppy
[ 270.601404] CPU: 1 PID: 972 Comm: losetup Not tainted 3.13.0-155-generic #205-Ubuntu
[ 270.601404] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 270.601404] task: ffff88003998e000 ti: ffff88003b1b6000 task.ti: ffff88003b1b6000
[ 270.601404] RIP: 0010:[<ffffffff81379d4d>] [<ffffffff81379d4d>] strnlen+0xd/0x40
[ 270.601404] RSP: 0018:ffff88003b1b7888 EFLAGS: 00010086
[ 270.601404] RAX: ffffffff81a674a1 RBX: ffffffff81ecbdec RCX: fffffffffffffffe
[ 270.601404] RDX: 0000000000001000 RSI: ffffffffffffffff RDI: 0000000000001000
[ 270.601404] RBP: ffff88003b1b7888 R08: 000000000000ffff R09: 000000000000ffff
[ 270.601404] R10: ffffffff813e27f0 R11: ffff88003b1b773e R12: 0000000000001000
[ 270.601404] R13: ffffffff81ecc1c0 R14: 00000000ffffffff R15: 0000000000000000
[ 270.601404] FS: 00007fbcbba18740(0000) GS:ffff88003ee80000(0000) knlGS:0000000000000000
[ 270.601404] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 270.601404] CR2: 0000000000001000 CR3: 000000003a9fe000 CR4: 0000000000000670
[ 270.601404] Stack:
[ 270.601404] ffff88003b1b78c0 ffffffff8137c0ab ffffffff81ecbdec ffffffff81ecc1c0
[ 270.601404] ffff88003b1b79c0 ffffffff81a939c6 ffffffff81a939c6 ffff88003b1b7928
[ 270.601404] ffffffff8137d521 0000000000000086 ffff88003b1b773e 000000000000000c
[ 270.601404] Call Trace:
[ 270.601404] [<ffffffff8137c0ab>] string.isra.5+0x3b/0xf0
[ 270.601404] [<ffffffff8137d521>] vsnprintf+0x1c1/0x610
[ 270.601404] [<ffffffff8137d97d>] vscnprintf+0xd/0x30
[ 270.601404] [<ffffffff810c4b91>] vprintk_emit+0x111/0x530
[ 270.601404] [<ffffffff8173313a>] printk+0x67/0x69
[ 270.601404] [<ffffffff8135a683>] aix_partition+0x613/0x620
[ 270.601404] [<ffffffff813768de>] ? radix_tree_lookup_slot+0xe/0x10
[ 270.601404] [<ffffffff8135e0a0>] msdos_partition+0x870/0x890
[ 270.601404] [<ffffffff81158581>] ? read_cache_page+0x21/0x30
[ 270.601404] [<ffffffff8135826d>] ? read_dev_sector+0x2d/0x90
[ 270.601404] [<ffffffff8137da39>] ? snprintf+0x39/0x40
[ 270.601404] [<ffffffff8135d830>] ? parse_solaris_x86+0x230/0x230
[ 270.601404] [<ffffffff81358eaa>] check_partition+0x10a/0x240
[ 270.601404] [<ffffffff81358ad7>] rescan_partitions+0xb7/0x2c0
[ 270.601404] [<ffffffff813540ef>] blkdev_ioctl+0xef/0x7d0
[ 270.601404] [<ffffffff8173cfa9>] ? schedule_timeout+0x279/0x310
[ 270.601404] [<ffffffff812016f3>] ioctl_by_bdev+0x33/0x40
[ 270.601404] [<ffffffff814c765a>] loop_set_status+0x39a/0x3b0
[ 270.601404] [<ffffffff814c7930>] loop_set_status64+0x50/0x70
[ 270.601404] [<ffffffff814c9938>] lo_ioctl+0x1e8/0x730
[ 270.601404] [<ffffffff8135421f>] blkdev_ioctl+0x21f/0x7d0
[ 270.601404] [<ffffffff8174aa3c>] ? system_call_after_swapgs+0x156/0x170
[ 270.601404] [<ffffffff812016b1>] block_ioctl+0x41/0x50
[ 270.601404] [<ffffffff811da953>] do_vfs_ioctl+0x2e3/0x4d0
[ 270.601404] [<ffffffff8174a9fd>] ? system_call_after_swapgs+0x117/0x170
[ 270.601404] [<ffffffff8174a9f6>] ? system_call_after_swapgs+0x110/0x170
[ 270.601404] [<ffffffff8174a9ef>] ? system_call_after_swapgs+0x109/0x170
[ 270.601404] [<ffffffff8174a9e8>] ? system_call_after_swapgs+0x102/0x170
[ 270.601404] [<ffffffff8174a9e1>] ? system_call_after_swapgs+0xfb/0x170
[ 270.601404] [<ffffffff8174a9da>] ? system_call_after_swapgs+0xf4/0x170
[ 270.601404] [<ffffffff8174a9d3>] ? system_call_after_swapgs+0xed/0x170
[ 270.601404] [<ffffffff8174a9cc>] ? system_call_after_swapgs+0xe6/0x170
[ 270.601404] [<ffffffff811dabc1>] SyS_ioctl+0x81/0xa0
[ 270.601404] [<ffffffff8174a99b>] ? system_call_after_swapgs+0xb5/0x170
[ 270.601404] [<ffffffff8174aa70>] system_call_fastpath+0x1a/0x1f
[ 270.601404] Code: c0 01 80 38 00 75 f7 48 29 f8 5d c3 31 c0 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 85 f6 48 8d 4e ff 48 89 e5 74 2a <80> 3f 00 74 25 48 89 f8 31 d2 eb 10 0f 1f 80 00 00 00 00 48 83
[ 270.601404] RIP [<ffffffff81379d4d>] strnlen+0xd/0x40
[ 270.601404] RSP <ffff88003b1b7888>
[ 270.601404] CR2: 0000000000001000
[ 270.732715] ---[ end trace c22abe83af8ab594 ]---
After:
$ uname -a
Linux bionic 3.13.0-155-generic #205+sf181954.1 SMP Wed Aug 15 17:05:33 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
$ echo 9 | sudo tee /proc/sys/kernel/printk
9
$ sudo losetup --find --show --partscan rlv_grkgld.1mb
/dev/loop0
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1787281
Title:
errors when scanning partition table of corrupted AIX disk
Status in linux package in Ubuntu:
Incomplete
Bug description:
[Impact]
* Users with disks/LUNs that were previously used for AIX operating
system installations, and whose partition table has since been
overwritten or corrupted, might hit kernel failures during the
partition scan of such a disk/LUN, and possibly hang the system
(seen with retries).
* The Linux kernel should be robust to corrupted disk data: it should
sanitize and check what it reads from disk rather than fail.
* The fix is a couple of simple logic changes that make the code
of the AIX partition table parser more robust.
[Test Case]
* Run the partition scan on the (trimmed) disk image of the AIX LUN
(not provided here since it contains customer data) with this
command:
$ sudo losetup --find --show --partscan rlv_grkgld.1mb
* On failure, the command hangs, and messages like these are printed
to the console, depending on the kernel version (see tests below)
[ 270.506420] partition (null) (3 pp's found) is not contiguous
[ 270.597428] BUG: unable to handle kernel paging request at 0000000000001000
[ 270.599525] IP: [<ffffffff81379d4d>] strnlen+0xd/0x40
* On success, the command prints a loop device name, for example:
/dev/loop0
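
The pass/fail criteria above can be checked mechanically. The following is a minimal sketch, not part of the original report: the helper names `is_loop_device` and `has_failure_signature` are hypothetical, and the patterns are taken only from the messages quoted in this test case, so other kernel versions may print different text.

```shell
#!/bin/sh
# Hypothetical helpers for judging one run of the test case.
# The names and patterns are illustrative assumptions, not from the report.

# Succeeds if the argument looks like a loop device path such as /dev/loop0.
is_loop_device() {
    printf '%s\n' "$1" | grep -Eq '^/dev/loop[0-9]+$'
}

# Reads kernel log text on stdin. Succeeds if it contains either of the
# failure messages quoted in the test case.
has_failure_signature() {
    grep -Eq "partition \(null\) \([0-9]+ pp's found\) is not contiguous|BUG: unable to handle kernel paging request"
}

# Example use (needs root and the disk image, so it is left commented out):
#   out=$(sudo losetup --find --show --partscan rlv_grkgld.1mb)
#   is_loop_device "$out" && ! dmesg | has_failure_signature && echo PASS
```

Note that on an affected kernel the losetup command may never return, so the check on the kernel log is only reached when the command completes. On a hang, the console output has to be inspected by hand.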
[Regression Potential]
* Low. Both changes are simple improvements in logic.
* This affects users who attach disks/LUNs from the AIX OS;
it should only change behavior for users who relied on
uninitialized variables to work correctly during the partition
scan of those disks/LUNs, which should be rare since the code
is likely to fail, as observed in this scenario.
* This has been tested on Cosmic, Bionic, Xenial, and Trusty.
[Other Info]
* Patches will be sent to the kernel-team mailing list.
Bug Description:
---------------
We've recently received a disk image from an AIX LUN that, when
attached to Linux, displayed errors on the console and eventually
hung the system (especially if the SCSI bus was re-scanned,
leading to another partition scan).
Apparently the LUN was originally installed with AIX and later
exercised with some I/O stress/overwrites, which caused certain
bits to be wrong in just the right way for Linux to get a NULL
pointer and invalid data.
This is the test-case used ('--partscan' is the important bit).
$ sudo losetup --show --find --partscan aix-lun.img
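
The test case earlier in this report uses a trimmed copy of the image. The report does not say how it was trimmed. As a sketch only, assuming that trimming means keeping the first 1 MiB of the image (as the `.1mb` suffix suggests), a hypothetical `trim_image` helper could look like this:

```shell
#!/bin/sh
# Hypothetical helper (not from the report): copy the first N MiB of a disk
# image into a new file. Keeping the *first* MiB is an assumption.
# Usage: trim_image SOURCE DEST [MIB]   (MIB defaults to 1)
trim_image() {
    src=$1
    dst=$2
    mib=${3:-1}
    # bs is spelled out in bytes so this does not depend on dd size suffixes.
    dd if="$src" of="$dst" bs=1048576 count="$mib" 2>/dev/null
}

# Example use (file names are illustrative):
#   trim_image aix-lun.img aix-lun.1mb 1
```

Whether 1 MiB is enough to reproduce the problem depends on where the on-disk metadata read by the parser is located, so the trimmed copy should be re-tested with the losetup command above.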
Since the original code is old, it affects several releases.
It is worth fixing this on 14.04 and later, since 14.04 is where
IBM Power servers were first supported (they can run AIX too,
and so may hit this with a previously used disk/LUN).