Bug#982459: closing 982459

2023-07-03 Thread Salvatore Bonaccorso
close 982459 5.18.2-1
thanks



Processed: Re: Bug#982459: mdadm examine corrupts host ext4

2022-12-05 Thread Debian Bug Tracking System
Processing control commands:

> retitle -1 mdadm --examine in chroot without /dev mounted corrupts host's 
> filesystem
Bug #982459 [src:linux] mdadm --examine in chroot without /proc,/dev,/sys 
mounted corrupts host's filesystem
Changed Bug title to 'mdadm --examine in chroot without /dev mounted corrupts 
host's filesystem' from 'mdadm --examine in chroot without /proc,/dev,/sys 
mounted corrupts host's filesystem'.
> found -1 5.10.127-2
Bug #982459 [src:linux] mdadm --examine in chroot without /dev mounted corrupts 
host's filesystem
Marked as found in versions linux/5.10.127-2.
> fixed -1 5.18.2-1~bpo11+1
Bug #982459 [src:linux] mdadm --examine in chroot without /dev mounted corrupts 
host's filesystem
Marked as fixed in versions linux/5.18.2-1~bpo11+1.

-- 
982459: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=982459
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#982459: mdadm examine corrupts host ext4

2022-12-05 Thread Diederik de Haas
Control: retitle -1 mdadm --examine in chroot without /dev mounted corrupts 
host's filesystem
Control: found -1 5.10.127-2
Control: fixed -1 5.18.2-1~bpo11+1

On Tuesday, 2 August 2022 11:03:09 CET Chris Hofstaedtler wrote:
> Control: reassign -1 src:linux

On 10 Feb 2021 14:29:52 +0100 Patrick Cernko  wrote:
> $MDADM --examine --scan --config=partitions
> 
> If I run this command in a chroot on a machine with md0 as host's root 
> filesystem WITHOUT mounting /proc, /sys and /dev in the chroot, mdadm 
> CORRUPTS the host's root filesystem (/dev/md0 with ext4 filesystem 
> format). I can reproduce this problem every time I do this. 
> 
> Kernel: Linux 5.4.78.1.amd64-smp (SMP w/4 CPU cores)
> Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_USER, 
> TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE

Patrick: AFAICT, that is not a Debian (provided) kernel.
Are or were you able to reproduce this issue with a Debian kernel?
If so, which (exact) version?

> * Håkan T Johansson  [220801 19:31]:
> > On Sun, 31 Jul 2022, Chris Hofstaedtler wrote:
> > > I can't see a difference that should matter from userspace.
> > > 
> > > I have stared a bit at the kernel code... there have been quite some
> > > changes and fixes in this area. Which kernel version were you
> > > running when testing this?
> > > 
> > > Could you retry on something >= 5.9? I.e. some version with patch
> > > 08fc1ab6d748ab1a690fd483f41e2938984ce353.
> > 
> > I believe that I was running 5.10 (bullseye).

Håkan: IIUC, the bug occurs with the 5.10.127-2 kernel.
If you try it with the most recent 5.10 kernel, does the issue still occur?
If we have a 'good' and a 'bad' 5.10 kernel, that would make it easier to
narrow down in which commit it was fixed.

> > It looks like 5.18 (from backports) does not show the issue!  (i.e. works)
> > 
> > host:
> > linux-image-5.18.0-0.bpo.1-amd64  5.18.2-1~bpo11+1
> > 
> > [bug still occurs with]
> > host:
> >   linux-image-5.10.0-16-amd64   5.10.127-2

Updated the bug accordingly.

> > This time I did get some dmesg BUG output as well (attached).

For reference [dmesg 1]:
[mån aug  1 15:53:08 2022] BUG: kernel NULL pointer dereference, address: 
0010
[mån aug  1 15:53:08 2022] #PF: supervisor read access in kernel mode
[mån aug  1 15:53:08 2022] #PF: error_code(0x) - not-present page
[mån aug  1 15:53:08 2022] PGD 0 P4D 0 
[mån aug  1 15:53:08 2022] Oops:  [#1] SMP PTI
[mån aug  1 15:53:08 2022] CPU: 2 PID: 284256 Comm: cron Tainted: P   
OE 5.10.0-16-amd64 #1 Debian 5.10.127-2
[mån aug  1 15:53:08 2022] Hardware name: Dell Computer Corporation PowerEdge 
2850/0T7971, BIOS A04 09/22/2005
[mån aug  1 15:53:08 2022] RIP: 0010:__ext4_journal_get_write_access+0x29/0x120 
[ext4]
[mån aug  1 15:53:08 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55 
41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab d7 bb e1 48 8b 45 
30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
[mån aug  1 15:53:08 2022] RSP: 0018:ae27c059fd60 EFLAGS: 00010246
[mån aug  1 15:53:08 2022] RAX:  RBX: 9d1b94505480 RCX: 
9d1bc52e5e38
[mån aug  1 15:53:08 2022] RDX: 9d1bc13782d8 RSI: 0c14 RDI: 
c096feb0
[mån aug  1 15:53:08 2022] RBP: 9d1bc52e5e38 R08: 9d1be04d5230 R09: 
0001
[mån aug  1 15:53:08 2022] R10: 9d1bc985f000 R11: 001d R12: 
9d1bc13782d8
[mån aug  1 15:53:08 2022] R13: 9d1be04d5000 R14: 0c14 R15: 
9d1bc13782d8
[mån aug  1 15:53:08 2022] FS:  7fed5ecb1840() 
GS:9d1cd7c8() knlGS:
[mån aug  1 15:53:08 2022] CS:  0010 DS:  ES:  CR0: 80050033
[mån aug  1 15:53:08 2022] CR2: 0010 CR3: 0001a46d8000 CR4: 
06e0
[mån aug  1 15:53:08 2022] Call Trace:
[mån aug  1 15:53:08 2022]  ext4_orphan_del+0x23f/0x290 [ext4]
[mån aug  1 15:53:08 2022]  ext4_evict_inode+0x31f/0x630 [ext4]
[mån aug  1 15:53:08 2022]  evict+0xd1/0x1a0
[mån aug  1 15:53:08 2022]  __dentry_kill+0xe4/0x180
[mån aug  1 15:53:08 2022]  dput+0x149/0x2f0
[mån aug  1 15:53:08 2022]  __fput+0xe4/0x240
[mån aug  1 15:53:08 2022]  task_work_run+0x65/0xa0
[mån aug  1 15:53:08 2022]  exit_to_user_mode_prepare+0x111/0x120
[mån aug  1 15:53:08 2022]  syscall_exit_to_user_mode+0x28/0x140
[mån aug  1 15:53:08 2022]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[mån aug  1 15:53:08 2022] RIP: 0033:0x7fed5eea2d77

> > I also noticed that the BUG: report in dmesg does not happen directly
> > when doing 'mdadm --examine --scan --config=partitions'.  It rather
> > occurs when some activity happens on the host filesystem, e.g.
> > a 'touch /root/a' command.
> > 
> > I have tried with both kernels several times, and it was repeatable that
> > 5.10 got stuck while 5.18 does not show issues.

Repeatable is good :-)
If you have a minimal set of steps to reproduce the issue, can you share that?

> If you have the time, maybe trying the various kernel 

Processed: Re: Bug#982459: mdadm examine corrupts host ext4

2022-08-02 Thread Debian Bug Tracking System
Processing control commands:

> reassign -1 src:linux
Bug #982459 [mdadm] mdadm --examine in chroot without /proc,/dev,/sys mounted 
corrupts host's filesystem
Bug reassigned from package 'mdadm' to 'src:linux'.
No longer marked as found in versions mdadm/4.1-1.
Ignoring request to alter fixed versions of bug #982459 to the same values 
previously set




Bug#982459: mdadm examine corrupts host ext4

2022-08-02 Thread Chris Hofstaedtler
Control: reassign -1 src:linux

Dear Håkan,

thanks for reporting back and testing!

* Håkan T Johansson  [220801 19:31]:
> On Sun, 31 Jul 2022, Chris Hofstaedtler wrote:
> 
> > I can't see a difference that should matter from userspace.
> > 
> > I have stared a bit at the kernel code... there have been quite some
> > changes and fixes in this area. Which kernel version were you
> > running when testing this?
> > 
> > Could you retry on something >= 5.9? I.e. some version with patch
> >   08fc1ab6d748ab1a690fd483f41e2938984ce353.
> 
> I believe that I was running 5.10 (bullseye).
> 
> It looks like 5.18 (from backports) does not show the issue!  (i.e. works)

Okay, I think we are now clearly in "this is not an mdadm bug per
se" territory (-> reassigning to src:linux).

[..]
>   This time I did get some dmesg BUG output as well (attached).
>   It does not seem to be the same backtrace on two occurrences.
> 
>   I also noticed that the BUG: report in dmesg does not happen directly
>   when doing 'mdadm --examine --scan --config=partitions'.  It rather
>   occurs when some activity happens on the host filesystem, e.g.
>   a 'touch /root/a' command.
> 
> host:
>   linux-image-5.18.0-0.bpo.1-amd64  5.18.2-1~bpo11+1
> 
>   (did not re-install anything else, except upgraded zfs, also from
>   backports (since pure bullseye would not compile with 5.18))
> 
>   Does not exhibit the problem.
> 
> I have tried with both kernels several times, and it was repeatable that
> 5.10 got stuck while 5.18 does not show issues.

It's good that this now works in 5.18. However, I'm not sure how we
should find the commit fixing this: in 5.14 lots of block layer
code was shuffled around/refactored.

If you have the time, maybe trying the various kernel versions
between 5.10 and 5.18 would be a good start. If they are not in
backports anymore, they should still be at
  http://snapshot.debian.org/package/linux/

> Reminder: to get the issue, /dev/ should not be mounted in the chroot.
> With /dev/ mounted, 5.10 also works.

I'll see if I can repro this on 5.10, but need to find a box first.

Best,
Chris


Bug#982459: mdadm examine corrupts host ext4

2022-08-01 Thread Håkan T Johansson


On Sun, 31 Jul 2022, Chris Hofstaedtler wrote:


> I can't see a difference that should matter from userspace.
> 
> I have stared a bit at the kernel code... there have been quite some
> changes and fixes in this area. Which kernel version were you
> running when testing this?
> 
> Could you retry on something >= 5.9? I.e. some version with patch
>    08fc1ab6d748ab1a690fd483f41e2938984ce353.


Dear Chris,

I believe that I was running 5.10 (bullseye).

It looks like 5.18 (from backports) does not show the issue!  (i.e. works)

Some more details:

I have now tried again:

host:
  linux-image-5.10.0-16-amd64   5.10.127-2
  mdadm 4.2-1~bpo11+1
chroot:
  mdadm 4.1-11

  Some more details:

  This time I did get some dmesg BUG output as well (attached).
  It does not seem to be the same backtrace on two occurrences.

  I also noticed that the BUG: report in dmesg does not happen directly
  when doing 'mdadm --examine --scan --config=partitions'.  It rather
  occurs when some activity happens on the host filesystem, e.g.
  a 'touch /root/a' command.

host:
  linux-image-5.18.0-0.bpo.1-amd64  5.18.2-1~bpo11+1

  (did not re-install anything else, except upgraded zfs, also from
  backports (since pure bullseye would not compile with 5.18))

  Does not exhibit the problem.

I have tried with both kernels several times, and it was repeatable that 
5.10 got stuck while 5.18 does not show issues.


Reminder: to get the issue, /dev/ should not be mounted in the chroot.
With /dev/ mounted, 5.10 also works.

Best regards,
Håkan

[mån aug  1 15:53:08 2022] BUG: kernel NULL pointer dereference, address: 
0010
[mån aug  1 15:53:08 2022] #PF: supervisor read access in kernel mode
[mån aug  1 15:53:08 2022] #PF: error_code(0x) - not-present page
[mån aug  1 15:53:08 2022] PGD 0 P4D 0 
[mån aug  1 15:53:08 2022] Oops:  [#1] SMP PTI
[mån aug  1 15:53:08 2022] CPU: 2 PID: 284256 Comm: cron Tainted: P   
OE 5.10.0-16-amd64 #1 Debian 5.10.127-2
[mån aug  1 15:53:08 2022] Hardware name: Dell Computer Corporation PowerEdge 
2850/0T7971, BIOS A04 09/22/2005
[mån aug  1 15:53:08 2022] RIP: 
0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
[mån aug  1 15:53:08 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55 
41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab d7 bb e1 48 8b 45 
30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
[mån aug  1 15:53:08 2022] RSP: 0018:ae27c059fd60 EFLAGS: 00010246
[mån aug  1 15:53:08 2022] RAX:  RBX: 9d1b94505480 RCX: 
9d1bc52e5e38
[mån aug  1 15:53:08 2022] RDX: 9d1bc13782d8 RSI: 0c14 RDI: 
c096feb0
[mån aug  1 15:53:08 2022] RBP: 9d1bc52e5e38 R08: 9d1be04d5230 R09: 
0001
[mån aug  1 15:53:08 2022] R10: 9d1bc985f000 R11: 001d R12: 
9d1bc13782d8
[mån aug  1 15:53:08 2022] R13: 9d1be04d5000 R14: 0c14 R15: 
9d1bc13782d8
[mån aug  1 15:53:08 2022] FS:  7fed5ecb1840() 
GS:9d1cd7c8() knlGS:
[mån aug  1 15:53:08 2022] CS:  0010 DS:  ES:  CR0: 80050033
[mån aug  1 15:53:08 2022] CR2: 0010 CR3: 0001a46d8000 CR4: 
06e0
[mån aug  1 15:53:08 2022] Call Trace:
[mån aug  1 15:53:08 2022]  ext4_orphan_del+0x23f/0x290 [ext4]
[mån aug  1 15:53:08 2022]  ext4_evict_inode+0x31f/0x630 [ext4]
[mån aug  1 15:53:08 2022]  evict+0xd1/0x1a0
[mån aug  1 15:53:08 2022]  __dentry_kill+0xe4/0x180
[mån aug  1 15:53:08 2022]  dput+0x149/0x2f0
[mån aug  1 15:53:08 2022]  __fput+0xe4/0x240
[mån aug  1 15:53:08 2022]  task_work_run+0x65/0xa0
[mån aug  1 15:53:08 2022]  exit_to_user_mode_prepare+0x111/0x120
[mån aug  1 15:53:08 2022]  syscall_exit_to_user_mode+0x28/0x140
[mån aug  1 15:53:08 2022]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[mån aug  1 15:53:08 2022] RIP: 0033:0x7fed5eea2d77
[mån aug  1 15:53:08 2022] Code: 44 00 00 48 8b 15 19 a1 0c 00 f7 d8 64 89 02 
b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 03 00 00 00 0f 
05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 e9 a0 0c 00 f7 d8 64 89 02 b8
[mån aug  1 15:53:08 2022] RSP: 002b:7ffd50452818 EFLAGS: 0202 
ORIG_RAX: 0003
[mån aug  1 15:53:08 2022] RAX:  RBX: 55dab4578910 RCX: 
7fed5eea2d77
[mån aug  1 15:53:08 2022] RDX: 7fed5ef6e8a0 RSI:  RDI: 
0006
[mån aug  1 15:53:08 2022] RBP:  R08:  R09: 
7fed5ef6dbe0
[mån aug  1 15:53:08 2022] R10: 006f R11: 0202 R12: 
7fed5ef6f4a0
[mån aug  1 15:53:08 2022] R13:  R14:  R15: 
0001
[mån aug  1 15:53:08 2022] Modules linked in: msr autofs4 nfsd auth_rpcgss 
nfsv3 nfs_acl nfs lockd grace sunrpc nfs_ssc fscache xt_mac xt_length xt_recent 
xt_multiport xt_tcpudp xt_state xt_conntrack 

Bug#982459: mdadm examine corrupts host ext4

2022-07-30 Thread Chris Hofstaedtler
Hi Håkan,

* Håkan T Johansson  [220730 23:43]:
> I have now tried with the mdadm 4.2~rc2-2 installed in both the chroot
> environment (tried only that first), and also the host system.
> Unfortunately, the host / fs is still affected when running
> 'update-initramfs -u', when /dev is not mounted.
[..]

> is kind of readable, though, then I'm lost.

I can't see a difference that should matter from userspace.

I have stared a bit at the kernel code... there have been quite some
changes and fixes in this area. Which kernel version were you
running when testing this?

Could you retry on something >= 5.9? I.e. some version with patch
08fc1ab6d748ab1a690fd483f41e2938984ce353.

Thanks,
Chris



Bug#982459:

2021-08-15 Thread Felix Lechner
Hi,

On Sun, Aug 15, 2021 at 2:45 AM Håkan T Johansson  wrote:
>
> I believe that I have been hit by this bug too.

Thanks for the bug amendment! The 4.1 release happened nearly three
years ago. With bullseye released, I just uploaded the latest release
candidate 4.2~rc2-2 from upstream to Debian unstable. Feel free to try
that too. Thank you!

Kind regards
Felix Lechner



Bug#982459:

2021-08-15 Thread Håkan T Johansson


Hi,

I believe that I have been hit by this bug too.

What has happened for me is that the machine in question 'almost' locks 
up, with a read-only /, and such that most commands to debug further never 
complete due to waiting for filesystem action.  It then requires a reboot.


'dmesg' has worked, and then shows ext4-related issues.  However, they 
were not recorded to /var/log.  I generally do not find any corruption on 
the filesystem itself when running fsck afterwards.


On the machine I have a number of chroot debian installations of different 
releases. By pure chance I found that 'update-initramfs' was the trigger 
for the system hangs. I could then repeatably trigger the issue again.
(Before this, it would happen as part of system maintenance (unattended 
upgrades in the chroots), so just spuriously hang the machine.)


In my case, the chroot installations live on a ZFS filesystem.  But the 
host system itself is on (multiple; /, /usr/, /var/ ) MD raid1.


I have had /proc mounted in the chroots.  But had forgotten /dev .  After 
mounting /dev (and /dev/pts) in the chroots, the issue has not happened 
again.


The issue occurred while the host system ran Buster. I upgraded to Bullseye 
~2 weeks ago, hoping it would be resolved, but the issue was still present 
after the upgrade.  Only after that upgrade did I find the update-initramfs 
trigger.


I am running with sysvinit, both on host and chroots.

Currently, I do not have hands-on access to the system, so cannot inspect 
or reboot it reliably.  Should be able to do some further tests in a few 
weeks.


Best regards,
Håkan

Bug#982459: mdadm --examine in chroot without /proc,/dev,/sys mounted corrupts host's filesystem

2021-07-30 Thread Felix Lechner
Hi,

On Tue, Jul 13, 2021 at 12:42 AM Judit Foglszinger  wrote:
>
> tried again but still fail to reproduce

Thanks for trying to reproduce this bug! I am not sure it makes any
difference either way, but I recently uploaded upstream's new release
candidate 4.2~rc1 to experimental:

https://packages.debian.org/source/experimental/mdadm

Kind regards
Felix Lechner



Bug#982459: mdadm --examine in chroot without /proc,/dev,/sys mounted corrupts host's filesystem

2021-07-13 Thread Judit Foglszinger
Hi,

> I could reproduce the bug with /dev *NOT* mounted in chroot. It seems 
> independent of /sys being mounted in chroot.

tried again but still fail to reproduce 
(same configuration as last time, just with /proc mounted to chroot/proc, rest 
not mounted).

Additionally tried it with a RAID0 and also to install a kernel with initrd to 
the chroot,
though again didn't manage to get the host file system corrupted.
(system used for that second try was RC2 of bullseye on virtualbox,
raid was configured using the Debian installer)

I think, I need to give up on this.  Maybe someone else has an idea.




Bug#982459: mdadm --examine in chroot without /proc,/dev,/sys mounted corrupts host's filesystem

2021-06-21 Thread Patrick Cernko

Hi,

On 18.06.21 12:48, Patrick Cernko wrote:


> I will try to reproduce the bug now with one of /dev or /sys mounted and 
> check if it still occurs or not. I will send my report about this later 
> as this will take some time again.




I could reproduce the bug with /dev *NOT* mounted in chroot. It seems 
independent of /sys being mounted in chroot.


Best Regards,
--
Patrick Cernko 
Joint Administration: Information Services and Technology
Max-Planck-Institute fuer Informatik & Softwaresysteme





Bug#982459: mdadm --examine in chroot without /proc,/dev,/sys mounted corrupts host's filesystem

2021-06-18 Thread Patrick Cernko

Hi,

On 25.04.21 00:36, Judit Foglszinger wrote:


> can you reproduce this bug on bullseye? (4.1-11)
> If so, what is your configuration (VM used, type of RAID)?
> Are all three conditions (/proc, /dev and /sys not mounted) required
> or does this also happen, if eg /dev and /sys are there but not /proc?
> 
> If it still occurs, then until upstream provides a proper fix,
> a workaround like "are we in a chroot, if so,
> are the required things mounted, if not, fail",
> could be used to avoid the file system corruption.
> 
> My own observations:
> 
> Could not reproduce in virtualbox (both chroot and host system using recent 
> bullseye),
> using RAID1,  /dev/md0 on / type ext4 (rw,relatime,errors=remount-ro)
> 
> # chroot chroot
> / # mdadm --examine --scan --config=partitions
> / # mdadm: cannot open /proc/partitions
> / # mdadm: No devices listed in partitions
> 
> (in background on host running the mentioned find / command)
> 
> No filesystem corruption after over 15 minutes,
> running the mdadm command in chroot several times didn't make a difference on 
> that.



I'm really sorry: Somehow I missed this mail when it arrived in my inbox 6 
weeks ago. I only noticed the reply when I checked bugs.debian.org 
last week.


I tried to reproduce the bug again and discovered that my description 
contained a serious error: In fact /proc MUST be mounted in the chroot 
to observe the bug!


I also could reproduce the bug with mdadm-4.1-11 (from bullseye) 
installed in the buster chroot (all other packages still from buster).


I will try to reproduce the bug now with one of /dev or /sys mounted and 
check if it still occurs or not. I will send my report about this later 
as this will take some time again.


Sorry for the delayed answer and the error in my initial bug report.

Best Regards,
--
Patrick Cernko 
Joint Administration: Information Services and Technology
Max-Planck-Institute fuer Informatik & Softwaresysteme





Bug#982459: mdadm --examine in chroot without /proc,/dev,/sys mounted corrupts host's filesystem

2021-04-24 Thread Judit Foglszinger
tags 982459 +moreinfo
user debian-rele...@lists.debian.org
usertags -1 + bsp-2021-04-AT-Salzburg
thank you

Hi,

can you reproduce this bug on bullseye? (4.1-11)
If so, what is your configuration (VM used, type of RAID)?
Are all three conditions (/proc, /dev and /sys not mounted) required
or does this also happen, if eg /dev and /sys are there but not /proc?

If it still occurs, then until upstream provides a proper fix,
a workaround like "are we in a chroot, if so,
are the required things mounted, if not, fail",
could be used to avoid the file system corruption.
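
The guard described above could look roughly like the following shell sketch. It is an illustration only: the function name require_mounted is made up here and is not part of mdadm or initramfs-tools, and it assumes that mountpoint(1) from util-linux is available inside the chroot.

```shell
#!/bin/sh
# Hypothetical guard (not part of mdadm or initramfs-tools): succeed only if
# every directory given as an argument is a mount point.
require_mounted() {
    for dir in "$@"; do
        # mountpoint -q exits non-zero when $dir is not a mount point.
        if ! mountpoint -q "$dir"; then
            echo "refusing to run: $dir is not mounted" >&2
            return 1
        fi
    done
    return 0
}
```

A caller inside the chroot could run `require_mounted /proc /dev /sys || exit 1` ahead of the mdadm invocation, so that the step fails early instead of running with a missing mount.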

My own observations:

Could not reproduce in virtualbox (both chroot and host system using recent 
bullseye),
using RAID1,  /dev/md0 on / type ext4 (rw,relatime,errors=remount-ro)

# chroot chroot 
/ # mdadm --examine --scan --config=partitions
/ # mdadm: cannot open /proc/partitions
/ # mdadm: No devices listed in partitions

(in background on host running the mentioned find / command)

No filesystem corruption after over 15 minutes,
running the mdadm command in chroot several times didn't make a difference on 
that.




Bug#982459: mdadm --examine in chroot without /proc,/dev,/sys mounted corrupts host's filesystem

2021-02-10 Thread Patrick Cernko

Package: mdadm
Version: 4.1-1
Severity: critical
Tags: upstream



When installing a kernel with initrd enabled, initramfs-tools calls 
/usr/share/initramfs-tools/hooks/mdadm. Doing this in a chroot 
previously created with debootstrap causes the hook to call


$MDADM --examine --scan --config=partitions

If I run this command in a chroot on a machine with md0 as host's root 
filesystem WITHOUT mounting /proc, /sys and /dev in the chroot, mdadm 
CORRUPTS the host's root filesystem (/dev/md0 with ext4 filesystem 
format). I can reproduce this problem every time I do this. To detect 
it, I ran a background job that reads all files in /:


  while sleep 1; do
    find / -xdev -type f -exec cat {} + > /dev/null
    echo 3 > /proc/sys/vm/drop_caches # drop caches to force a re-read
  done

A few seconds to minutes after invoking the corrupting command, you can 
see messages like this in kernel log:


  EXT4-fs error (device md0): ext4_validate_inode_bitmap:100: comm uptimed: Corrupt inode bitmap - block_group = 96, inode_bitmap = 3145744
  EXT4-fs error (device md0): ext4_validate_block_bitmap:384: comm uptimed: bg 97: bad block bitmap checksum
  EXT4-fs error (device md0) in ext4_free_blocks:4964: Filesystem failed CRC
  EXT4-fs error (device md0) in ext4_free_inode:357: Corrupt filesystem

We did not try to repair such filesystems but reinstalled the machine 
every time this occurred while investigating.
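
To spot such messages without reading the whole kernel log, a small filter can help. This is only a sketch: the function name count_ext4_errors is made up for illustration, and it assumes the log text is supplied on standard input (for example from dmesg).

```shell
#!/bin/sh
# Hypothetical helper: count "EXT4-fs error" lines for one device in a kernel
# log read from standard input. The device name defaults to md0.
count_ext4_errors() {
    dev="${1:-md0}"
    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so the status is masked to keep callers simple.
    grep -c -F "EXT4-fs error (device $dev)" || true
}
```

On the host this could be used as `dmesg | count_ext4_errors md0`; any result other than 0 means the kernel has logged ext4 errors for that device.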


I tried to debug the problem and could narrow it down to a 
BLKPG_DEL_PARTITION ioctl issued on a temporary device inode created by 
mdadm while running. This call is made in

util.c:int test_partition(int fd)

which is (somehow) called by

Examine.c:int Examine(...)


Invoking the same command in the chroot after mounting /dev, /proc and 
/sys in the chroot does not corrupt the host's filesystem.



Please forward this bug report to upstream in order to get a 
fix/workaround or at least a huge warning implemented in mdadm to avoid 
data corruption for other users.




-- Package-specific info:


-- System Information:
Debian Release: 10.7
  APT prefers proposed-updates
  APT policy: (500, 'proposed-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.4.78.1.amd64-smp (SMP w/4 CPU cores)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_USER, 
TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), 
LANGUAGE=de_DE.UTF-8 (charmap=UTF-8)

Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages mdadm depends on:
ii  debconf [debconf-2.0]  1.5.71
ii  libc6  2.28-10
ii  lsb-base   10.2019051400
ii  udev   241-7~deb10u5

Versions of packages mdadm recommends:
ii  exim4-daemon-light [mail-transport-agent]  4.92-8+deb10u4
ii  kmod   26-1

Versions of packages mdadm suggests:
pn  dracut-core  

-- Configuration Files:
/etc/cron.daily/mdadm [Errno 2] No such file or directory: 
'/etc/cron.daily/mdadm'


-- debconf information excluded
/* O_DIRECT */
#define _GNU_SOURCE

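/* mknod */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* printf */
#include <stdio.h>

/* open */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* errno */
#include <errno.h>

/* ioctl */
#include <sys/ioctl.h>

/* BLKPG */
/* #include <linux/blkpg.h> */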
/*
 * following taken from linux/blkpg.h because they aren't
 * anywhere else and it isn't safe to #include linux/ * stuff.
 */

#define BLKPG  _IO(0x12,105)

/* The argument structure */
struct blkpg_ioctl_arg {
	int op;
	int flags;
	int datalen;
	void *data;
};

/* The subfunctions (for the op field) */
#define BLKPG_ADD_PARTITION	1
#define BLKPG_DEL_PARTITION	2

/* Sizes of name fields. Unused at present. */
#define BLKPG_DEVNAMELTH	64
#define BLKPG_VOLNAMELTH	64

/* The data structure for ADD_PARTITION and DEL_PARTITION */
struct blkpg_partition {
	long long start;		/* starting offset in bytes */
	long long length;		/* length in bytes */
	int pno;			/* partition number */
	char devname[BLKPG_DEVNAMELTH];	/* partition name, like sda5 or c0d1p2,
	   to be used in kernel messages */
	char volname[BLKPG_VOLNAMELTH];	/* volume label */
};

/* memset */
#include <string.h>

/* lseek */
#include <sys/types.h>
#include <unistd.h>

/* BLKGETSIZE64 */
#ifndef BLKGETSIZE64
#define BLKGETSIZE64 _IOR(0x12,114,size_t) /* return device size in bytes (u64 *arg) */
#endif


#define align(p, a) (((long)(p) + (a - 1)) & ~(a - 1))




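int do_seek(int fd, unsigned long long offset, int whence) {
  /* %llu matches the unsigned long long offset */
  printf("lseek(%d, %llu, SEEK_SET)\n", fd, offset);
  /* lseek returns off_t; cast so the format specifier is portable */
  printf("lseek()=%lld\n", (long long)lseek(fd, (off_t)offset, whence));
  return 1;
}

int do_read(int fd, size_t size) {
  /* room for up to 4096 bytes plus slack for 512-byte alignment */
  char unalignedbuffer[4096 + 512];
  char* buffer;
  ssize_t bytes;
  if (size > 4096) {
    printf("Reading %zu bytes not supported yet!\n", size);
    return 0;
  }
  /* O_DIRECT reads need an aligned buffer */
  buffer = (char*)align(unalignedbuffer, 512);
  printf("read(%d, buffer, %zu)\n", fd, size);
  bytes = read(fd, buffer, size);
  if (bytes < 0) {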
printf("read()=%d,