Re: Kernel message: BUG: Bad page state in process kworker

2021-03-13 Thread Reco
On Sat, Mar 13, 2021 at 03:01:45AM -0500, Michael Grant wrote:
> > I'd say it is a Linode problem, unless you run custom kernel modules.
> > It looks like a "memory" corruption to me and since it is virtualized
> > system, you should check if host system is ok.
> > Memory in quotes because this issue could be also related to a storage
> > sub-system (local or network attached) of the host or VM.
> 
> I'm definitely not running a custom kernel.

You do run a non-Debian kernel. It says so in your dmesg:

kworker/0:3 Tainted: GB 5.10.13-x86_64-linode141 #1
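
For anyone reading such a line for the first time, here is a rough Python
sketch of how the release string and the taint letters can be pulled out of
it. The name parse_taint_line and the TAINT_FLAGS table are made up for this
example and are not anything the kernel ships; the table covers only a subset
of the letters listed in the kernel's tainted-kernels documentation. The
release string (here ending in -linode141) is what shows who built the kernel;
the B flag only records that a bad page has already been reported.

```python
import re

# Subset of the taint letters from the kernel's tainted-kernels guide.
TAINT_FLAGS = {
    "G": "no proprietary module loaded",
    "P": "proprietary module loaded",
    "B": "bad page referenced or unexpected page flags",
    "D": "kernel died recently (oops or BUG)",
    "W": "kernel issued a warning",
    "O": "out-of-tree module loaded",
    "E": "unsigned module loaded",
}

# Taint letters may be separated by spaces; the release string follows them.
_TAINT_RE = re.compile(r"Tainted:\s+([A-Z ]+?)\s+(\d+\.\d+\.\d+\S*)")


def parse_taint_line(line):
    """Return (release, {letter: meaning}), or None if no taint info is found."""
    match = _TAINT_RE.search(line)
    if match is None:
        return None
    letters = match.group(1).replace(" ", "")
    flags = {c: TAINT_FLAGS.get(c, "not in this table") for c in letters}
    return match.group(2), flags


line = "CPU: 0 PID: 769 Comm: kworker/0:3 Tainted: GB 5.10.13-x86_64-linode141 #1"
release, flags = parse_taint_line(line)
print(release)
for letter, meaning in flags.items():
    print(letter, "=", meaning)
```

Comparing the printed release with what a Debian-packaged kernel reports in
"uname -r" on another machine makes the difference easy to see.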

Reco



Re: Kernel message: BUG: Bad page state in process kworker

2021-03-13 Thread Michael Grant
> I'd say it is a Linode problem, unless you run custom kernel modules.
> It looks like a "memory" corruption to me and since it is virtualized system,
> you should check if host system is ok.
> Memory in quotes because this issue could be also related to a storage
> sub-system (local or network attached) of the host or VM.

I'm definitely not running a custom kernel.

They live-migrated this Linode to a new physical machine and I am still
getting these errors, so I don't think it's a h/w issue.  It seems to
be a qemu issue to me, but I am not very familiar with qemu.

The fact that they live-migrated it does seem to imply it's all part
of the same virtualization system, probably the same qemu, which would
explain why the migration didn't fix the error.

I'd like to know if anyone else who is using Linode and has recently
been migrated to their new metal is seeing this.




Re: Kernel message: BUG: Bad page state in process kworker

2021-03-12 Thread Alexander V. Makartsev

On 13.03.2021 11:31, Michael Grant wrote:

I'm seeing lots of errors like this in my kern.log on 2 of 3 of my
Debian Linodes running testing on Linode's provided kernel 5.10.13.  Is
this a problem in Debian or is this a Linode issue?

Mar 12 19:32:18 strange kernel: [10849.820363] BUG: Bad page state in process kworker/0:3  pfn:10902f
Mar 12 19:32:18 strange kernel: [10849.825374] page:edbc1187 refcount:-1 mapcount:0 mapping: index:0x0 pfn:0x10902f
Mar 12 19:32:18 strange kernel: [10849.827665] flags: 0x200()
Mar 12 19:32:18 strange kernel: [10849.828894] raw: 0200 dead0100 dead0122
Mar 12 19:32:18 strange kernel: [10849.830817] raw: 0011
Mar 12 19:32:18 strange kernel: [10849.832758] page dumped because: nonzero _refcount
Mar 12 19:32:18 strange kernel: [10849.834141] Modules linked in:
Mar 12 19:32:18 strange kernel: [10849.835229] CPU: 0 PID: 769 Comm: kworker/0:3 Tainted: GB 5.10.13-x86_64-linode141 #1
Mar 12 19:32:18 strange kernel: [10849.837423] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Mar 12 19:32:18 strange kernel: [10849.840005] Workqueue: mm_percpu_wq drain_local_pages_wq
Mar 12 19:32:18 strange kernel: [10849.841538] Call Trace:
Mar 12 19:32:18 strange kernel: [10849.842620]  dump_stack+0x6d/0x88
Mar 12 19:32:18 strange kernel: [10849.843813]  bad_page.cold.119+0x63/0x93
Mar 12 19:32:18 strange kernel: [10849.845096]  free_pcppages_bulk+0x18e/0x6a0
Mar 12 19:32:18 strange kernel: [10849.846423]  drain_pages_zone+0x41/0x50
Mar 12 19:32:18 strange kernel: [10849.847740]  drain_pages+0x3c/0x50
Mar 12 19:32:18 strange kernel: [10849.848956]  drain_local_pages_wq+0xe/0x10
Mar 12 19:32:18 strange kernel: [10849.850282]  process_one_work+0x1fb/0x390
Mar 12 19:32:18 strange kernel: [10849.851579]  ? process_one_work+0x390/0x390
Mar 12 19:32:18 strange kernel: [10849.852929]  worker_thread+0x221/0x3e0
Mar 12 19:32:18 strange kernel: [10849.854159]  ? process_one_work+0x390/0x390
Mar 12 19:32:18 strange kernel: [10849.855648]  kthread+0x116/0x130
Mar 12 19:32:18 strange kernel: [10849.857066]  ? kthread_park+0x80/0x80
Mar 12 19:32:18 strange kernel: [10849.858424]  ret_from_fork+0x22/0x30

I'd say it is a Linode problem, unless you run custom kernel modules.
It looks like "memory" corruption to me, and since it is a virtualized
system, you should check whether the host system is OK.
"Memory" is in quotes because this issue could also be related to the
storage subsystem (local or network-attached) of the host or VM.
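
If it helps when raising this with the hosting provider, the reports can be
tallied per process and per page frame to see whether the same frames keep
coming up. A minimal Python sketch, assuming the "BUG: Bad page state in
process ... pfn:..." format shown in the log above; tally_bad_pages is a
hypothetical helper name, and the sample text is the first line of that log.

```python
import re
from collections import Counter

# \s+ also spans a line break, so wrapped copies of the log still match.
_BAD_PAGE_RE = re.compile(
    r"BUG: Bad page state in process\s+(\S+)\s+pfn:([0-9a-fA-F]+)"
)


def tally_bad_pages(log_text):
    """Count 'Bad page state' reports per process and per page frame number."""
    by_process = Counter()
    by_pfn = Counter()
    for process, pfn in _BAD_PAGE_RE.findall(log_text):
        by_process[process] += 1
        by_pfn[int(pfn, 16)] += 1  # pfn is printed in hex without a 0x prefix
    return by_process, by_pfn


sample = (
    "Mar 12 19:32:18 strange kernel: [10849.820363] "
    "BUG: Bad page state in process kworker/0:3  pfn:10902f\n"
)
by_process, by_pfn = tally_bad_pages(sample)
for process, count in by_process.most_common():
    print(process, count)
for pfn, count in by_pfn.most_common():
    print(hex(pfn), count)
```

On a real system, pass it the contents of kern.log (typically
/var/log/kern.log on Debian) instead of the sample string. Whether a repeated
pfn means anything is for whoever runs the host to judge.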



--
With kindest regards, Alexander.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄