Re: [CentOS-virt] Major stability problems with xen 4.6.6
> It seems the patch you mentioned was merged to upstream Linux here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71472fa9c52b1da27663c275d416d8654b905f05
>
> and then reverted/removed here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=896d81fefe5d1919537db2c2150ab6384e4a6610
>
> Do you know if there has been proper/fixed patch after that? has it been
> merged to upstream Linux kernel already?

Interesting! I didn't come across that when digging into this.

It looks like this hasn't been followed up on at all since April:
https://lists.gt.net/engine?list=linux;do=search_results;search_type=AND;search_forum=forum_1;search_string=ldisc%20reopened&sb=post_time

Currently I've got ~40 dom0's running with the patch on 4.9.44-39 and it's resolved all stability issues, previously I was seeing multiple crashes a week.

Cheers,
Nathan
___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt
Re: [CentOS-virt] Major stability problems with xen 4.6.6
Hi,

On Thu, Aug 24, 2017 at 03:45:46PM -0700, Nathan March wrote:
> Just in case anyone else on this list is running into similar issues, I
> can confirm that the patch appears to have resolved this.
>
> I've opened https://bugs.centos.org/view.php?id=13713
>
> It was so bad that having the system under load (with rpmbuild) and
> opening another ssh window or two would almost always cause the oops.

It seems the patch you mentioned was merged to upstream Linux here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71472fa9c52b1da27663c275d416d8654b905f05

and then reverted/removed here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=896d81fefe5d1919537db2c2150ab6384e4a6610

Do you know if there has been proper/fixed patch after that? has it been merged to upstream Linux kernel already?

Thanks,

-- Pasi
Re: [CentOS-virt] Major stability problems with xen 4.6.6
Just in case anyone else on this list is running into similar issues, I can confirm that the patch appears to have resolved this.

I've opened https://bugs.centos.org/view.php?id=13713

It was so bad that having the system under load (with rpmbuild) and opening another ssh window or two would almost always cause the oops.

Cheers,
Nathan
Re: [CentOS-virt] Major stability problems with xen 4.6.6
This appears to be a CentOS kernel issue rather than a Xen one.

https://lkml.org/lkml/2016/5/17/440

I've been digging through the posts and it's not clear why this never made it upstream.

I'm going to apply that patch to my systems and see if it resolves the crashes, but I won't know for certain until a week or two of stability goes by.

- Nathan
[CentOS-virt] Major stability problems with xen 4.6.6
Hi,

I'm seeing numerous crashes on the xen 4.6.6-1 / 4.6.6-2 releases, on both the 4.9.34-29 and 4.9.39-29 kernels.

I've attached a txt with two different servers' outputs.

Xen-028: This crashed this morning while running 4.6.6-1 and 4.9.39-29
Xen-001: This crashed shortly after being upgraded to 4.6.6-2 and 4.9.34-29

Both are on different hardware platforms, and have had a long history of being stable until these upgrades.

It sounds potentially related to https://kernel.googlesource.com/pub/scm/linux/kernel/git/tiwai/sound-unstable/+/9ce119f318ba1a07c29149301f1544b6c4bea52a%5E%21/ but I've confirmed this patch is in the above kernels.

Any suggestions / thoughts?

Cheers,
Nathan

Aug 23 10:19:31 xen-028 kernel: [590071.735515] BUG: unable to handle kernel paging request at 2260
Aug 23 10:19:31 xen-028 kernel: [590071.735795] IP: [] n_tty_receive_buf_common+0xa4/0x1f0
Aug 23 10:19:31 xen-028 kernel: [590071.736031] PGD 0
Aug 23 10:19:31 xen-028 kernel: [590071.736083]
Aug 23 10:19:31 xen-028 kernel: [590071.736300] Oops: [#1] SMP
Aug 23 10:19:31 xen-028 kernel: [590071.736470] Modules linked in: ebt_ip6 ebt_ip ebtable_filter ebtables arptable_filter arp_tables bridge xen_pciback xen_gntalloc nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd sunrpc grace 8021q mrp garp stp llc bonding blktap xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd ipmi_devintf ipmi_si ipmi_msghandler gpio_ich iTCO_wdt iTCO_vendor_support fjes acpi_power_meter dcdbas pcspkr serio_raw joydev lpc_ich igb ixgbe dca ptp pps_core mdio i7core_edac edac_core bnx2 raid1 megaraid_sas ttm
Aug 23 10:19:31 xen-028 kernel: [590071.740051] CPU: 14 PID: 21615 Comm: kworker/u48:1 Not tainted 4.9.39-29.el6.x86_64 #1
Aug 23 10:19:31 xen-028 kernel: [590071.740330] Hardware name: Dell Inc. PowerEdge R610/0F0XJ6, BIOS 6.0.7 08/18/2011
Aug 23 10:19:31 xen-028 kernel: [590071.740607] Workqueue: events_unbound flush_to_ldisc
Aug 23 10:19:31 xen-028 kernel: [590071.740806] task: 88008a6011c0 task.stack: c9004cfec000
Aug 23 10:19:31 xen-028 kernel: [590071.740966] RIP: e030:[] [] n_tty_receive_buf_common+0xa4/0x1f0
Aug 23 10:19:31 xen-028 kernel: [590071.741282] RSP: e02b:c9004cfefb08 EFLAGS: 00010296
Aug 23 10:19:31 xen-028 kernel: [590071.741442] RAX: 2260 RBX: RCX: 000a
Aug 23 10:19:31 xen-028 kernel: [590071.741714] RDX: RSI: 88015ecd6420 RDI: 8800afd654d8
Aug 23 10:19:31 xen-028 kernel: [590071.741994] RBP: c9004cfefb78 R08: 0001 R09: 81f0af00
Aug 23 10:19:31 xen-028 kernel: [590071.742274] R10: 7ff0 R11: 0078 R12: 000a
Aug 23 10:19:31 xen-028 kernel: [590071.742549] R13: 8800afd65400 R14: R15: 88015ecd6420
Aug 23 10:19:31 xen-028 kernel: [590071.742830] FS: 7f81da7317c0() GS:8801c098() knlGS:
Aug 23 10:19:31 xen-028 kernel: [590071.743112] CS: e033 DS: ES: CR0: 80050033
Aug 23 10:19:31 xen-028 kernel: [590071.743283] CR2: 2260 CR3: 8f61f000 CR4: 2660
Aug 23 10:19:31 xen-028 kernel: [590071.743564] Stack:
Aug 23 10:19:31 xen-028 kernel: [590071.743719] c900116c 8800afd654d8 0001c070
Aug 23 10:19:31 xen-028 kernel: [590071.744149] 2260 8a603340 8801c0997000
Aug 23 10:19:31 xen-028 kernel: [590071.744577] 8801c098b890 88015ecd6400 8800b19e9c00 c9004cfefbf8
Aug 23 10:19:31 xen-028 kernel: [590071.745008] Call Trace:
Aug 23 10:19:31 xen-028 kernel: [590071.745169] [] n_tty_receive_buf2+0x14/0x20
Aug 23 10:19:31 xen-028 kernel: [590071.745335] [] tty_ldisc_receive_buf+0x23/0x50
Aug 23 10:19:31 xen-028 kernel: [590071.745501] [] flush_to_ldisc+0xc8/0x100
Aug 23 10:19:31 xen-028 kernel: [590071.745669] [] ? __switch_to+0x1dc/0x680
Aug 23 10:19:31 xen-028 kernel: [590071.745836] [] process_one_work+0x170/0x500
Aug 23 10:19:31 xen-028 kernel: [590071.746005] [] ? __schedule+0x238/0x530
Aug 23 10:19:31 xen-028 kernel: [590071.746169] [] ? maybe_create_worker+0x94/0x120
Aug 23 10:19:31 xen-028 kernel: [590071.746342] [] ? schedule+0x3a/0xa0
Aug 23 10:19:31 xen-028 kernel: [590071.746506] [] worker_thread+0x166/0x580
Aug 23 10:19:31 xen-028 kernel: [590071.746671] [] ? __schedule+0x238/0x530
Aug 23 10:19:31 xen-028 kernel: [590071.749537] [] ? default_wake_function+0x12/0x20
Aug 23 10:19:31 xen-028 kernel: [590071.749706] [] ? maybe_create_worker+0x120/0x120
Aug 23 10:19:31 xen-028 kernel: [590071.749872] [] ? schedule+0x3a/0xa0
Aug 23 10:19:31 xen-028 kernel: [590071.750040] [] ? _raw_spin_unlock_irqrestore+0x16/0x20
Aug 23 10:19:31 xen-028 kernel: [590071.750204] [] ? maybe_create_worker+0x120/0x120
Aug 23 10:19:31 xen-028 kernel: [590071.750369] [] kthread+0xe5/0x100
Aug 23 10:19:31 xen-028 kernel: