Re: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
> -----Original Message-----
> From: CentOS-virt [mailto:centos-virt-boun...@centos.org] On Behalf Of Johnny Hughes
> Sent: Wednesday, January 24, 2018 6:39 AM
> To: centos-virt@centos.org
> Subject: Re: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
>
> On 01/24/2018 01:01 AM, Pasi Kärkkäinen wrote:
> > On Tue, Jan 23, 2018 at 06:20:39PM -0600, Kevin Stange wrote:
> >> On 01/23/2018 05:57 PM, Karl Johnson wrote:
> >>> On Tue, Jan 23, 2018 at 4:50 PM, Nathan March <mailto:nat...@gt.net> wrote:
> >>>
> >>>     Hi,
> >>>
> >>>     > Hmm.. isn't this the ldisc bug that was discussed a few months ago on this list,
> >>>     > and a patch was applied to the virt-sig kernel as well?
> >>>     >
> >>>     > Call trace looks similar..
> >>>
> >>>     Good memory! I'd forgotten about that despite being the one who ran into it.
> >>>
> >>>     Looks like that patch was just removed in 4.9.75-30, which I just upgraded this system to:
> >>>     http://cbs.centos.org/koji/buildinfo?buildID=21122
> >>>     Previously I was on 4.9.63-29, which does not have this problem and does have the ldisc patch.
> >>>     So I guess the question is for Johnny: why was it removed?
> >>>
> >>>     In the meantime, I'll revert the kernel and follow up if I see any further problems.
> >>>
> >>> IIRC the patch has been removed from the spec file because it has been
> >>> merged upstream in 4.9.71.
> >>
> >> The IRC discussion I found in my log indicates that it was removed
> >> because it didn't apply cleanly due to changes when updating to 4.9.75,
> >> yet I don't think anyone independently validated that the changes made
> >> are equivalent to the patch that was removed. I was never able to
> >> reproduce this issue, so I didn't investigate it myself.
> >
> > Sounds like the patch is still needed :)
> >
> > Anyone up to re-porting it to 4.9.75+ ?
>
> It looked, at first glance, like 4.9.71 fixed it .. I guess not in all cases.

I'm happy to do testing here if anyone's able to help with a patch. It does look like reverting to 4.9.63-29 solved it for me in the interim.
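For anyone else wanting to roll back while this gets sorted out, something along these lines should work (untested sketch; use whatever the 4.9.63-29 kernel package is actually called in the virt SIG repo you track):

yum install kernel-4.9.63-29.el6.x86_64   # kernel packages are install-only, so the old build sits alongside the new one
# then point the default boot entry at the older kernel/xen pair in your bootloader config and reboot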
Re: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
Hi,

> Hmm.. isn't this the ldisc bug that was discussed a few months ago on this list,
> and a patch was applied to the virt-sig kernel as well?
>
> Call trace looks similar..

Good memory! I'd forgotten about that despite being the one who ran into it.

Looks like that patch was just removed in 4.9.75-30, which I just upgraded this system to: http://cbs.centos.org/koji/buildinfo?buildID=21122
Previously I was on 4.9.63-29, which does not have this problem and does have the ldisc patch. So I guess the question is for Johnny: why was it removed?

In the meantime, I'll revert the kernel and follow up if I see any further problems.

Cheers,

Nathan
Re: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
> Thanks for the heads-up. It's been running through XenServer's tests > as well as the XenProject's "osstest" -- I haven't heard of any > additional issues, but I'll ask. Looks like I can reproduce this pretty easily, this happened upon ssh'ing into the server while I had a VM migrating into it. The system goes completely unresponsive (can't even enter a keystroke via console): [64722.291300] vlan208: port 4(vif5.0) entered forwarding state [64722.291695] NOHZ: local_softirq_pending 08 [64929.006981] BUG: unable to handle kernel paging request at 2260 [64929.007020] IP: [] n_tty_receive_buf_common+0xa4/0x1f0 [64929.007049] PGD 1f7a53067 [64929.007057] PUD 1ee0d4067 PMD 0 [64929.007069] [64929.007077] Oops: [#1] SMP [64929.007088] Modules linked in: ebt_ip6 ebt_ip ebtable_filter ebtables arptable_filter arp_tables bridge xen_pciback xen_gntalloc nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd sunrpc grace 8021q mrp garp stp llc bonding xen_acpi_processor blktap xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd dcdbas fjes pcspkr ipmi_devintf ipmi_si ipmi_msghandler joydev i2c_i801 i2c_smbus lpc_ich shpchp mei_me mei ioatdma ixgbe mdio igb dca ptp pps_core uas usb_storage wmi ttm [64929.007327] CPU: 15 PID: 17696 Comm: kworker/u48:0 Not tainted 4.9.75-30.el6.x86_64 #1 [64929.007343] Hardware name: Dell Inc. PowerEdge C6220/03C9JJ, BIOS 2.7.1 03/04/2015 [64929.007362] Workqueue: events_unbound flush_to_ldisc [64929.007376] task: 8801fbc70580 task.stack: c90048af8000 [64929.007415] RIP: e030:[] [] n_tty_receive_buf_common+0xa4/0x1f0 [64929.007465] RSP: e02b:c90048afbb08 EFLAGS: 00010296 [64929.007476] RAX: 2260 RBX: RCX: 0002 [64929.007519] RDX: RSI: 8801dc0f3c20 RDI: 8801f9b8acd8 [64929.007563] RBP: c90048afbb78 R08: 0001 R09: 8210f1c0 [64929.007577] R10: 7ff0 R11: R12: 0002 [64929.007620] R13: 8801f9b8ac00 R14: R15: 8801dc0f3c20 [64929.007675] FS: 7fcfc0af8700() GS:880204dc() knlGS: [64929.007718] CS: e033 DS: ES: CR0: 80050033 [64929.007759] CR2: 2260 CR3: 0001f067b000 CR4: 00042660 [64929.007782] Stack: [64929.007806] c90048afbb38 8801f9b8acd8 000104dda030 [64929.007858] 2260 fbc72700 880204dc48c0 [64929.007941] 880204dce890 8801dc0f3c00 8801f7f25c00 c90048afbbf8 [64929.007994] Call Trace: [64929.008008] [] n_tty_receive_buf2+0x14/0x20 [64929.008048] [] tty_ldisc_receive_buf+0x23/0x50 [64929.008088] [] flush_to_ldisc+0xc8/0x100 [64929.008133] [] ? __switch_to+0x20b/0x690 [64929.008176] [] ? xen_clocksource_read+0x15/0x20 [64929.008222] [] process_one_work+0x170/0x500 [64929.008268] [] ? __schedule+0x238/0x530 [64929.008310] [] ? schedule+0x3a/0xa0 [64929.008324] [] worker_thread+0x166/0x530 [64929.008368] [] ? put_prev_entity+0x29/0x140 [64929.008412] [] ? __schedule+0x238/0x530 [64929.008458] [] ? default_wake_function+0x12/0x20 [64929.008502] [] ? maybe_create_worker+0x120/0x120 [64929.008518] [] ? schedule+0x3a/0xa0 [64929.008555] [] ? _raw_spin_unlock_irqrestore+0x16/0x20 [64929.008599] [] ? maybe_create_worker+0x120/0x120 [64929.008616] [] kthread+0xe5/0x100 [64929.008630] [] ? schedule_tail+0x56/0xc0 [64929.008643] [] ? __kthread_init_worker+0x40/0x40 [64929.008659] [] ? 
schedule_tail+0x56/0xc0 [64929.008673] [] ret_from_fork+0x41/0x50 [64929.008685] Code: 89 fe 4c 89 ef 89 45 98 e8 aa fb ff ff 8b 45 98 48 63 d0 48 85 db 48 8d 0c 13 48 0f 45 d9 01 45 bc 49 01 d7 41 29 c4 48 8b 45 b0 <48> 8b 30 48 89 75 c0 49 8b 0e 8d 96 00 10 00 00 29 ca 41 f6 85 [64929.008894] RIP [] n_tty_receive_buf_common+0xa4/0x1f0 [64929.008914] RSP [64929.008923] CR2: 2260 [64929.009641] ---[ end trace e1da1cdf77fed144 ]--- [64929.009785] BUG: unable to handle kernel paging request at ffd8 [64929.009804] IP: [] kthread_data+0x10/0x20 [64929.009823] PGD 200d067 [64929.009831] PUD 200f067 PMD 0 [64929.009842] [64929.009850] Oops: [#2] SMP [64929.009864] Modules linked in: ebt_ip6 ebt_ip ebtable_filter ebtables arptable_filter arp_tables bridge xen_pciback xen_gntalloc nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd sunrpc grace 8021q mrp garp stp llc bonding xen_acpi_processor blktap xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd dcdbas fjes pcspkr ipmi_devintf ipmi_si ipmi_msghandler joydev i2c_i801 i2c_smbus lpc_ich shpchp mei_me mei ioatdma ixgbe mdio igb dca ptp pps_core uas usb_storage wmi ttm [64929.010054] CPU: 15 PID: 17696 Comm: kworker/u48:0 Tainted: G D 4.9.75-30.el6.x86_64 #1 [64929.010068] Hardware name: Dell Inc. PowerEdge C6220/03C9JJ, BIOS 2.7.1 03/04/2015 [64929.010127] task: 8801fbc70580 task.stack: c90048af8000 [64929.010138] RIP: e030:[] [] kthread_data
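If it helps anyone chasing this, the crashing offset can usually be turned into a source line without reproducing the hang, assuming you have a vmlinux with debuginfo that matches the running 4.9.75-30 build (rough sketch; the debuginfo path is illustrative):

# from a kernel source tree of the same version:
./scripts/faddr2line vmlinux n_tty_receive_buf_common+0xa4/0x1f0
# or interactively with gdb against the debuginfo vmlinux:
gdb /usr/lib/debug/lib/modules/4.9.75-30.el6.x86_64/vmlinux -ex 'list *(n_tty_receive_buf_common+0xa4)' -ex quit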
Re: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
Just a heads up that I'm seeing major stability problems on these builds. I didn't have console capture set up unfortunately, but I have seen my test hypervisor hard lock twice over the weekend. This is with xpti being used, rather than the shim.

Cheers,

Nathan

> -----Original Message-----
> From: CentOS-virt [mailto:centos-virt-boun...@centos.org] On Behalf Of George Dunlap
> Sent: Wednesday, January 17, 2018 9:14 AM
> To: Discussion about the virtualization on CentOS
> Subject: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
>
> I've built & tagged packages for CentOS 6 and 7 4.6.6-9, with XPTI
> "stage 1" Meltdown mitigation.
>
> This will allow 64-bit PV guests to run safely (with a few caveats),
> but incurs a fairly significant slowdown for 64-bit PV guests on Intel
> boxes (including domain 0).
>
> If you prefer using Vixen / Comet, you can turn it off by adding
> 'xpti=0' to your Xen command-line.
>
> Detailed information can be found in the XSA-254 advisory:
>
> https://xenbits.xen.org/xsa/advisory-254.html
>
> Please test and report any issues you have. I'll probably tag them
> with -release tomorrow.
>
> 4.8 packages should be coming to buildlogs soon.
>
> -George
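For reference, if you'd rather rely on Vixen/Comet (the shim) and disable XPTI, it's just a hypervisor command-line change; a rough sketch for a CentOS 7 dom0 booting via grub2 (the other options shown are placeholders for whatever you already pass):

# /etc/default/grub
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=... loglvl=all xpti=0"
grub2-mkconfig -o /boot/grub2/grub.cfg
# on a CentOS 6 dom0, append xpti=0 to the xen.gz line in /boot/grub/grub.conf instead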
Re: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
> -----Original Message-----
> From: CentOS-virt [mailto:centos-virt-boun...@centos.org] On Behalf Of Peter Peltonen
> Sent: Thursday, January 18, 2018 11:19 AM
> To: Discussion about the virtualization on CentOS
> Subject: Re: [CentOS-virt] Xen 4.6.6-9 (with XPTI meltdown mitigation) packages making their way to centos-virt-xen-testing
>
> Thanks George.
>
> As there are now quite many options to choose from, what would be the
> best option performance-wise for running 32-bit domUs under xen-4.6?
>
> Best,
> Peter

It's worth taking a look at the table in the latest XSA; it helps clarify a fair bit: https://xenbits.xen.org/xsa/advisory-254.html

Cheers,

Nathan
[CentOS-virt] Stability issues since moving to 4.6 - Kernel paging request bug + VM left in null state
Since moving from 4.4 to 4.6, I've been seeing an increasing number of stability issues on our hypervisors. I'm not clear if there's a singular root cause here, or if I'm dealing with multiple bugs. One of the more common ones I've seen, is a VM on shutdown will remain in the null state and a kernel bug is thrown: xen001 log # xl list NameID Mem VCPUs State Time(s) Domain-0 0 614424 r- 6639.7 (null) 3 0 1 --pscd 36.3 [89920.839074] BUG: unable to handle kernel paging request at 88020ee9a000 [89920.839546] IP: [] __memcpy+0x12/0x20 [89920.839933] PGD 2008067 [89920.840022] PUD 17f43f067 [89920.840390] PMD 1e0976067 [89920.840469] PTE 0 [89920.840833] [89920.841123] Oops: [#1] SMP [89920.841417] Modules linked in: ebt_ip ebtable_filter ebtables arptable_filter arp_tables bridge xen_pciback xen_gntalloc nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd sunrpc grace 8021q mrp garp stp llc bonding xen_acpi_processor blktap xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd dcdbas fjes pcspkr ipmi_devintf ipmi_si ipmi_msghandler joydev i2c_i801 i2c_smbus lpc_ich shpchp mei_me mei ioatdma ixgbe mdio igb dca ptp pps_core uas usb_storage wmi ttm [89920.847080] CPU: 4 PID: 1471 Comm: loop6 Not tainted 4.9.58-29.el6.x86_64 #1 [89920.847381] Hardware name: Dell Inc. PowerEdge C6220/03C9JJ, BIOS 2.7.1 03/04/2015 [89920.847893] task: 8801b75e0700 task.stack: c900460e [89920.848192] RIP: e030:[] [] __memcpy+0x12/0x20 [89920.848783] RSP: e02b:c900460e3b20 EFLAGS: 00010246 [89920.849081] RAX: 88018916d000 RBX: 8801b75e0700 RCX: 0200 [89920.849384] RDX: RSI: 88020ee9a000 RDI: 88018916d000 [89920.849686] RBP: c900460e3b38 R08: 88011da9fcf8 R09: 0002 [89920.849989] R10: 88019535bddc R11: ea0006245b5c R12: 1000 [89920.850294] R13: 88018916e000 R14: 1000 R15: c900460e3b68 [89920.850605] FS: 7fb865c30700() GS:880204b0() knlGS: [89920.851118] CS: e033 DS: ES: CR0: 80050033 [89920.851418] CR2: 88020ee9a000 CR3: 0001ef03b000 CR4: 00042660 [89920.851720] Stack: [89920.852009] 814375ca c900460e3b38 c900460e3d08 c900460e3bb8 [89920.852821] 814381c5 c900460e3b68 c900460e3d08 1000 [89920.853633] c900460e3d88 1000 ea00 [89920.854445] Call Trace: [89920.854741] [] ? memcpy_from_page+0x3a/0x70 [89920.855043] [] iov_iter_copy_from_user_atomic+0x265/0x290 [89920.855354] [] generic_perform_write+0xf3/0x1d0 [89920.855673] [] ? xen_load_tls+0xaa/0x160 [89920.855992] [] nfs_file_write+0xdb/0x200 [nfs] [89920.856297] [] vfs_iter_write+0xa2/0xf0 [89920.856599] [] lo_write_bvec+0x65/0x100 [89920.856899] [] do_req_filebacked+0x195/0x300 [89920.857202] [] loop_queue_work+0x5b/0x80 [89920.857505] [] kthread_worker_fn+0x98/0x1b0 [89920.857808] [] ? schedule+0x3a/0xa0 [89920.858108] [] ? _raw_spin_unlock_irqrestore+0x16/0x20 [89920.858411] [] ? kthread_probe_data+0x40/0x40 [89920.858713] [] kthread+0xe5/0x100 [89920.859014] [] ? __kthread_init_worker+0x40/0x40 [89920.859317] [] ret_from_fork+0x25/0x30 [89920.859615] Code: 81 f3 00 00 00 00 e9 1e ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 66 90 66 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3 [89920.864410] RIP [] __memcpy+0x12/0x20 [89920.864749] RSP [89920.865021] CR2: 88020ee9a000 [89920.865294] ---[ end trace b77d2ce5646284d1 ]--- Wondering if anyone has advice on how to troubleshoot the above, or might have some insight into that the issue could be? This hypervisor was only up for a day, had almost no VMs running on it since boot, I booted a single windows test VM which BSOD'ed and then this happened. 
This is on xen 4.6.6-4.el6 with 4.9.58-29.el6.x86_64. I see these issues across a wide number of systems from both Dell and Supermicro, although we run the same Intel X540 10GbE NICs in each system with the same NetApp NFS backend storage.

Cheers,

Nathan
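Since these crashes take the whole box down, capturing the console is the main way to get the full trace. A rough example of serial console setup for a Xen dom0 (sketch; assumes the first serial port, adjust to your hardware or use IPMI serial-over-LAN):

# Xen (hypervisor) command line:
com1=115200,8n1 console=com1,vga loglvl=all guest_loglvl=all
# dom0 kernel command line:
console=hvc0
# then watch from another machine, e.g. via IPMI SOL (placeholders for BMC address/credentials):
ipmitool -I lanplus -H <bmc-ip> -U <user> -P <pass> sol activate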
Re: [CentOS-virt] Status of reverted Linux patch "tty: Fix ldisc crash on reopened tty", Linux 4.9 kernel frequent crashes
> > I have no issues rolling this patch in, while we wait on upstream, if
> > it makes our tree more stable.
> >
> I think we should do that.. What do others think?

I've had the patch deployed to a group of 32 hosts (with hundreds of VMs) for about 10 days now and no sign of any issues. So I support it =)

Cheers,

Nathan
Re: [CentOS-virt] Major stability problems with xen 4.6.6
> It seems the patch you mentioned was merged to upstream Linux here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71472fa9c52b1da27663c275d416d8654b905f05
>
> and then reverted/removed here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=896d81fefe5d1919537db2c2150ab6384e4a6610
>
> Do you know if there has been a proper/fixed patch after that? Has it been
> merged to the upstream Linux kernel already?

Interesting! I didn't come across that when digging into this. It looks like this hasn't been followed up on at all since April:
https://lists.gt.net/engine?list=linux;do=search_results;search_type=AND;search_forum=forum_1;search_string=ldisc%20reopened&sb=post_time

Currently I've got ~40 dom0s running with the patch on 4.9.44-39 and it's resolved all stability issues; previously I was seeing multiple crashes a week.

Cheers,

Nathan
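If someone wants to pick up the re-port, the upstream commit can be pulled straight from kernel.org and checked against the newer tree before it goes back into the spec file; a rough sketch (assumes an unpacked 4.9.75 source tree):

curl -o tty-ldisc-reopen.patch 'https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/patch/?id=71472fa9c52b1da27663c275d416d8654b905f05'
cd linux-4.9.75
patch -p1 --dry-run < ../tty-ldisc-reopen.patch   # see whether it still applies cleanly before rebuilding the RPM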
Re: [CentOS-virt] Major stability problems with xen 4.6.6
Just in case anyone else on this list is running into similar issues, I can confirm that the patch appears to have resolved this. I've opened https://bugs.centos.org/view.php?id=13713

It was so bad that having the system under load (with rpmbuild) and opening another ssh window or two would almost always cause the oops.

Cheers,

Nathan

From: CentOS-virt [mailto:centos-virt-boun...@centos.org] On Behalf Of Nathan March
Sent: Wednesday, August 23, 2017 3:32 PM
To: 'Discussion about the virtualization on CentOS'
Subject: Re: [CentOS-virt] Major stability problems with xen 4.6.6

This appears to be a CentOS kernel issue rather than a Xen one: https://lkml.org/lkml/2016/5/17/440

Digging through the posts, it's not clear why this never made it upstream. I'm going to apply that patch to my systems and see if it resolves things, but won't know for certain until a week or two of stability goes by.

- Nathan

From: CentOS-virt [mailto:centos-virt-boun...@centos.org] On Behalf Of Nathan March
Sent: Wednesday, August 23, 2017 2:48 PM
To: centos-virt@centos.org
Subject: [CentOS-virt] Major stability problems with xen 4.6.6

Hi,

I'm seeing numerous crashes on the xen 4.6.6-1 / 4.6.6-2 releases, on both the 4.9.34-29 and 4.9.39-29 kernels. I've attached a txt with the output from two different servers.

Xen-028: This crashed this morning while running 4.6.6-1 and 4.9.39-29
Xen-001: This crashed shortly after being upgraded to 4.6.6-2 and 4.9.34-29

Both are on different hardware platforms, and have had a long history of being stable until these upgrades. It sounds potentially related to https://kernel.googlesource.com/pub/scm/linux/kernel/git/tiwai/sound-unstable/+/9ce119f318ba1a07c29149301f1544b6c4bea52a%5E%21/ but I've confirmed this patch is in the above kernels.

Any suggestions / thoughts?

Cheers,

Nathan
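For anyone trying to compare before/after the patch, the trigger described above amounts to roughly this (hypothetical helper loop; any sustained dom0 load plus a few interactive ssh sessions seems to do it):

rpmbuild --rebuild some-package.src.rpm &   # keep the dom0 busy
for i in 1 2 3 4 5; do
    ssh dom0-host 'ps auxf; sleep 1' &      # open several short-lived tty sessions in parallel
done
wait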
Re: [CentOS-virt] Major stability problems with xen 4.6.6
This appears to be a CentOS kernel issue rather than a Xen one: https://lkml.org/lkml/2016/5/17/440

Digging through the posts, it's not clear why this never made it upstream. I'm going to apply that patch to my systems and see if it resolves things, but won't know for certain until a week or two of stability goes by.

- Nathan

From: CentOS-virt [mailto:centos-virt-boun...@centos.org] On Behalf Of Nathan March
Sent: Wednesday, August 23, 2017 2:48 PM
To: centos-virt@centos.org
Subject: [CentOS-virt] Major stability problems with xen 4.6.6

Hi,

I'm seeing numerous crashes on the xen 4.6.6-1 / 4.6.6-2 releases, on both the 4.9.34-29 and 4.9.39-29 kernels. I've attached a txt with the output from two different servers.

Xen-028: This crashed this morning while running 4.6.6-1 and 4.9.39-29
Xen-001: This crashed shortly after being upgraded to 4.6.6-2 and 4.9.34-29

Both are on different hardware platforms, and have had a long history of being stable until these upgrades. It sounds potentially related to https://kernel.googlesource.com/pub/scm/linux/kernel/git/tiwai/sound-unstable/+/9ce119f318ba1a07c29149301f1544b6c4bea52a%5E%21/ but I've confirmed this patch is in the above kernels.

Any suggestions / thoughts?

Cheers,

Nathan
[CentOS-virt] Major stability problems with xen 4.6.6
Hi, I'm seeing numerous crashes on the xen 4.6.6-1 / 4.6.6-2 releases, on both the 4.9.34-29 and 4.9.39-29 kernels. I've attached a txt with two different servers outputs. Xen-028: This crashed this morning while running 4.6.6-1 and 4.9.39-29 Xen-001: This crashed shortly after being upgraded to 4.6.6-2 and 4.9.34-29 Both are on different hardware platforms, and have had a long history of being stable until these upgrades. It sounds potentially related to https://kernel.googlesource.com/pub/scm/linux/kernel/git/tiwai/sound-unstabl e/+/9ce119f318ba1a07c29149301f1544b6c4bea52a%5E%21/ but I've confirmed this patch is in the above kernels. Any suggestions / thoughts? Cheers, Nathan Aug 23 10:19:31 xen-028 kernel: [590071.735515] BUG: unable to handle kernel paging request at 2260 Aug 23 10:19:31 xen-028 kernel: [590071.735795] IP: [] n_tty_receive_buf_common+0xa4/0x1f0 Aug 23 10:19:31 xen-028 kernel: [590071.736031] PGD 0 Aug 23 10:19:31 xen-028 kernel: [590071.736083] Aug 23 10:19:31 xen-028 kernel: [590071.736300] Oops: [#1] SMP Aug 23 10:19:31 xen-028 kernel: [590071.736470] Modules linked in: ebt_ip6 ebt_ip ebtable_filter ebtables arptable_filter arp_tables bridge xen_pciback xen_gntalloc nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd sunrpc grace 8021q mrp garp stp llc bonding blktap xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd ipmi_devintf ipmi_si ipmi_msghandler gpio_ich iTCO_wdt iTCO_vendor_support fjes acpi_power_meter dcdbas pcspkr serio_raw joydev lpc_ich igb ixgbe dca ptp pps_core mdio i7core_edac edac_core bnx2 raid1 megaraid_sas ttm Aug 23 10:19:31 xen-028 kernel: [590071.740051] CPU: 14 PID: 21615 Comm: kworker/u48:1 Not tainted 4.9.39-29.el6.x86_64 #1 Aug 23 10:19:31 xen-028 kernel: [590071.740330] Hardware name: Dell Inc. PowerEdge R610/0F0XJ6, BIOS 6.0.7 08/18/2011 Aug 23 10:19:31 xen-028 kernel: [590071.740607] Workqueue: events_unbound flush_to_ldisc Aug 23 10:19:31 xen-028 kernel: [590071.740806] task: 88008a6011c0 task.stack: c9004cfec000 Aug 23 10:19:31 xen-028 kernel: [590071.740966] RIP: e030:[] [] n_tty_receive_buf_common+0xa4/0x1f0 Aug 23 10:19:31 xen-028 kernel: [590071.741282] RSP: e02b:c9004cfefb08 EFLAGS: 00010296 Aug 23 10:19:31 xen-028 kernel: [590071.741442] RAX: 2260 RBX: RCX: 000a Aug 23 10:19:31 xen-028 kernel: [590071.741714] RDX: RSI: 88015ecd6420 RDI: 8800afd654d8 Aug 23 10:19:31 xen-028 kernel: [590071.741994] RBP: c9004cfefb78 R08: 0001 R09: 81f0af00 Aug 23 10:19:31 xen-028 kernel: [590071.742274] R10: 7ff0 R11: 0078 R12: 000a Aug 23 10:19:31 xen-028 kernel: [590071.742549] R13: 8800afd65400 R14: R15: 88015ecd6420 Aug 23 10:19:31 xen-028 kernel: [590071.742830] FS: 7f81da7317c0() GS:8801c098() knlGS: Aug 23 10:19:31 xen-028 kernel: [590071.743112] CS: e033 DS: ES: CR0: 80050033 Aug 23 10:19:31 xen-028 kernel: [590071.743283] CR2: 2260 CR3: 8f61f000 CR4: 2660 Aug 23 10:19:31 xen-028 kernel: [590071.743564] Stack: Aug 23 10:19:31 xen-028 kernel: [590071.743719] c900116c 8800afd654d8 0001c070 Aug 23 10:19:31 xen-028 kernel: [590071.744149] 2260 8a603340 8801c0997000 Aug 23 10:19:31 xen-028 kernel: [590071.744577] 8801c098b890 88015ecd6400 8800b19e9c00 c9004cfefbf8 Aug 23 10:19:31 xen-028 kernel: [590071.745008] Call Trace: Aug 23 10:19:31 xen-028 kernel: [590071.745169] [] n_tty_receive_buf2+0x14/0x20 Aug 23 10:19:31 xen-028 kernel: [590071.745335] [] tty_ldisc_receive_buf+0x23/0x50 Aug 23 10:19:31 xen-028 kernel: [590071.745501] [] flush_to_ldisc+0xc8/0x100 Aug 23 10:19:31 xen-028 kernel: [590071.745669] [] ? 
__switch_to+0x1dc/0x680 Aug 23 10:19:31 xen-028 kernel: [590071.745836] [] process_one_work+0x170/0x500 Aug 23 10:19:31 xen-028 kernel: [590071.746005] [] ? __schedule+0x238/0x530 Aug 23 10:19:31 xen-028 kernel: [590071.746169] [] ? maybe_create_worker+0x94/0x120 Aug 23 10:19:31 xen-028 kernel: [590071.746342] [] ? schedule+0x3a/0xa0 Aug 23 10:19:31 xen-028 kernel: [590071.746506] [] worker_thread+0x166/0x580 Aug 23 10:19:31 xen-028 kernel: [590071.746671] [] ? __schedule+0x238/0x530 Aug 23 10:19:31 xen-028 kernel: [590071.749537] [] ? default_wake_function+0x12/0x20 Aug 23 10:19:31 xen-028 kernel: [590071.749706] [] ? maybe_create_worker+0x120/0x120 Aug 23 10:19:31 xen-028 kernel: [590071.749872] [] ? schedule+0x3a/0xa0 Aug 23 10:19:31 xen-028 kernel: [590071.750040] [] ? _raw_spin_unlock_irqrestore+0x16/0x20 Aug 23 10:19:31 xen-028 kernel: [590071.750204] [] ? maybe_create_worker+0x120/0x120 Aug 23 10:19:31 xen-028 kernel: [590071.750369] [] kthread+0xe5/0x100 Aug 23 10:19:31 xen-028 kernel:
[CentOS-virt] Xen packages with XSA-226+?
Hi,

It's been almost a week now since XSA-226 through XSA-230 were released, and I'm just wondering when updated packages are expected to be posted? https://cbs.centos.org/koji/packageinfo?packageID=88 has nothing for the past month.

Thanks!

- Nathan
Re: [CentOS-virt] Xen Doc Day: Guide to setting up bridging on CentOS 6 / 7
If you'd like to extend that a little bit, here are example configs on how to do LACP and vlan tagging on C6:

host network-scripts # cat ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
USERCTL=no
BOOTPROTO=none
IPV6INIT=no
MTU=1500
MASTER=bond0
SLAVE=yes

host network-scripts # cat ifcfg-eth1
DEVICE=eth1
ONBOOT=yes
USERCTL=no
BOOTPROTO=none
IPV6INIT=no
MTU=1500
MASTER=bond0
SLAVE=yes

host network-scripts # cat ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
USERCTL=no
BOOTPROTO=none
IPV6INIT=no
BONDING_OPTS="miimon=100 mode=802.3ad"

host network-scripts # cat ifcfg-vlan###
DEVICE=vlan###
ONBOOT=yes
USERCTL=no
BOOTPROTO=none
IPV6INIT=no
PHYSDEV=bond0
VLAN=yes
VLAN_NAME_TYPE=VLAN_PLUS_VID_NO_PAD
IPADDR=10.x.x.x
NETMASK=255.255.255.0
GATEWAY=10.x.x.x
DOMAIN="example.com"
DNS1=x.x.x.x
DNS2=x.x.x.x
DNS3=x.x.x.x

I use my own bridging control scripts, but this should extend your existing doc nicely just by using BRIDGE=xenbr0 in the ifcfg-vlan### file (a sketch of that is below). Also, there's a small error in your C6 doc: you specify ifcfg-$dev but $dev never gets set anywhere.

- Nathan

> -----Original Message-----
> From: centos-virt-boun...@centos.org [mailto:centos-virt-boun...@centos.org] On Behalf Of George Dunlap
> Sent: Wednesday, October 28, 2015 10:02 AM
> To: Discussion about the virtualization on CentOS
> Subject: [CentOS-virt] Xen Doc Day: Guide to setting up bridging on CentOS 6 / 7
>
> In honor of Xen Doc Day, I've put up some basic HOWTOs for setting up
> bridging on CentOS 6 and 7. I'm far from an expert, so I'd appreciate any
> feedback.
>
> The howtos can be found here:
>
> https://wiki.centos.org/HowTos/Xen/Xen4QuickStart/Xen4Networking6
>
> https://wiki.centos.org/HowTos/Xen/Xen4QuickStart/Xen4Networking7
>
> -George
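To tie that into the HOWTO, the only extra piece would be pointing the VLAN interface at a bridge and moving the IP onto the bridge; an untested sketch (xenbr0 and the addresses are placeholders):

host network-scripts # cat ifcfg-vlan###
DEVICE=vlan###
ONBOOT=yes
BOOTPROTO=none
PHYSDEV=bond0
VLAN=yes
BRIDGE=xenbr0

host network-scripts # cat ifcfg-xenbr0
DEVICE=xenbr0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
DELAY=0
IPADDR=10.x.x.x
NETMASK=255.255.255.0
GATEWAY=10.x.x.x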
[CentOS-virt] Kernel oops on the dom0
Hi All, One of our developers managed to trigger a kernel oops on a 4.4.2 dom0.. Oops text is attached. He was working on setting up network namespaces / bridging inside a centos domU, had activated the bridge and lost networking (probably config error) so he rebooted the VM. On reboot is when we saw the oops, along with various xen procs hanging: root 23388 0.0 1.6 132796 32264 ?SLsl Oct02 0:04 /usr/sbin/xl create /mnt/xen/gx/xen/metrixc7 root 26119 0.0 0.0 0 0 ?ZOct02 0:00 \_ [block] root 26127 0.0 0.0 0 0 ?ZOct02 0:00 \_ [block] root 26137 0.0 0.0 0 0 ?ZOct02 0:00 \_ [block] root 26157 0.0 0.0 0 0 ?ZOct02 0:00 \_ [block] root 26169 0.0 0.0 0 0 ?ZOct02 0:00 \_ [block] root 26195 0.0 0.0 0 0 ?ZOct02 0:00 \_ [vif-bridge] root 24625 0.0 0.0 0 0 ?Ds Oct02 0:06 [tapdisk] At this point the dom0 is still up and running existing VMs fine, I can also migrate live VMs off of it successfully although the post-migration clean up fails and hangs: libxl: error: libxl_device.c:935:device_backend_callback: unable to remove device with path /local/domain/0/backend/vbd/17/51712 Host is running centos 7 with the 4.4.2-7 package and kernel 3.10.68-11.el6.centos.alt.x86_64. I've also attached xl dmesg. First time I've seen anything like this and not sure if his networking/bridging in the domU is related or just coincidental. Any thoughts / ideas? Going to try to reproduce on a test dom0 later this week, so happy to grab any additional debugging if required. - Nathan Oct 2 18:27:02 vana-031 kernel: BUG: unable to handle kernel paging request at 88006564b000 Oct 2 18:27:02 vana-031 kernel: IP: [] memcpy+0x6/0x110 Oct 2 18:27:02 vana-031 kernel: PGD 1c0d067 PUD 104e0c067 PMD 104ce0067 PTE 0 Oct 2 18:27:02 vana-031 kernel: Oops: 0002 [#1] SMP Oct 2 18:27:02 vana-031 kernel: Modules linked in: ebt_ip6 tun ebt_ip ebtable_filter ebtables arptable_filter arp_tables bridge xen_pciback xen_gntalloc nfsd auth_rpcgss nfsv3 nfs_acl nfs fscache lockd sunrpc 8021q garp stp llc bond ing ipv6 xen_acpi_processor blktap xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd iTCO_wdt iTCO_vendor_support dcdbas coretemp freq_table mperf crc32_pclmul crc32c_intel ghash_clmulni_intel microcode pcspkr ses enclosure sg ipmi_devintf ipmi_si ipmi_msghandler joydev i2c_i801 lpc_ich shpchp ixgbe mdio igb hwmon ptp pps_core ioatdma dca ext4 jbd2 mbcache raid1 sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 mpt2sas sc si_transport_sas raid_class ahci libahci wmi ttm drm_kms_helper dm_mirror dm_region_hash dm_log dm_mod Oct 2 18:27:02 vana-031 kernel: CPU: 18 PID: 24625 Comm: tapdisk Not tainted 3.10.68-11.el6.centos.alt.x86_64 #1 Oct 2 18:27:02 vana-031 kernel: Hardware name: Dell Inc. 
PowerEdge C6220 II/09N44V, BIOS 2.6.0 10/09/2014 Oct 2 18:27:02 vana-031 kernel: task: 880004ed2780 ti: 880024266000 task.ti: 880024266000 Oct 2 18:27:02 vana-031 kernel: RIP: e030:[] [] memcpy+0x6/0x110 Oct 2 18:27:02 vana-031 kernel: RSP: e02b:880024267c90 EFLAGS: 00010202 Oct 2 18:27:02 vana-031 kernel: RAX: 88006564b000 RBX: 88004104f830 RCX: 1000 Oct 2 18:27:02 vana-031 kernel: RDX: 1000 RSI: 88007c0ca000 RDI: 88006564b000 Oct 2 18:27:02 vana-031 kernel: RBP: 880024267cd8 R08: R09: Oct 2 18:27:02 vana-031 kernel: R10: R11: 880024267eb8 R12: 0001 Oct 2 18:27:02 vana-031 kernel: R13: 8800 R14: 6db6db6db6db6db7 R15: 1600 Oct 2 18:27:02 vana-031 kernel: FS: 7f62c755e740() GS:88010144() knlGS:8801014e Oct 2 18:27:02 vana-031 kernel: CS: e033 DS: ES: CR0: 80050033 Oct 2 18:27:02 vana-031 kernel: CR2: 88006564b000 CR3: 3533b000 CR4: 00042660 Oct 2 18:27:02 vana-031 kernel: DR0: DR1: DR2: Oct 2 18:27:02 vana-031 kernel: DR3: DR6: 0ff0 DR7: 0400 Oct 2 18:27:02 vana-031 kernel: Stack: Oct 2 18:27:02 vana-031 kernel: a030f42a 664b4800 880024267cc8 Oct 2 18:27:02 vana-031 kernel: 88004104f830 8800664b4800 8800664b4800 Oct 2 18:27:02 vana-031 kernel: 8800664b4820 880024267cf8 a030e1f7 88004104f830 Oct 2 18:27:02 vana-031 kernel: Call Trace: Oct 2 18:27:02 vana-031 kernel: [] ? blktap_request_bounce+0xda/0x100 [blktap] Oct 2 18:27:02 vana-031 kernel: [] blktap_ring_unmap_request+0x67/0x90 [blktap] Oct 2 18:27:02 vana-031 kernel: [] blktap_device_end_request+0x32/0x90 [blktap] Oct 2 18:27:02 vana-031 k
Re: [CentOS-virt] Timezone issues with migrations between host kernel 3.10 and 3.18
> -----Original Message-----
> From: centos-virt-boun...@centos.org [mailto:centos-virt-boun...@centos.org] On Behalf Of Johnny Hughes
> Sent: Thursday, July 30, 2015 4:41 AM
> To: centos-virt@centos.org
> Subject: Re: [CentOS-virt] Timezone issues with migrations between host kernel 3.10 and 3.18
>
> On 07/30/2015 06:38 AM, Johnny Hughes wrote:
> > On 07/29/2015 11:38 AM, Nathan March wrote:
> >> Hi All,
> >>
> >> I'm seeing clock issues with live migrations on the latest kernel
> >> packages, migrating a VM from 3.10.68-11 to 3.18.17-13 results in the
> >> VM clock being off by 7 hours (I'm PST, so appears to be a timezone issue).
> >> This is also between xen versions, but rolling the target back to
> >> 3.10 resolved so don't believe the recent XSA's are related.
> >>
> >> Anyone else seen behavior like this or have any ideas on how to resolve?
> >
> > Some versions of CentOS have a "hardware clock uses UTC" check box. If
> > that is on, and if your hardware clock is instead set to local time,
> > that can cause issues.
> >
> > Can you check that there is no UTC=True in /etc/sysconfig/clock
> > also, you can use tzselect to make sure the correct timezone is used.

I considered that, so I did use hwclock to confirm that the hardware clock / system clock were both the same between the two servers, and comparing /etc/adjtime between the two indicates the hwclock should have been local time on both sides (unless this got changed when I downgraded the kernel to resolve the issue). Unfortunately I don't have a test machine for this at the moment, but I can follow up in a couple weeks.

- Nathan
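For completeness, the checks being described boil down to something like this on both the source and destination dom0 (sketch):

hwclock --show                      # hardware clock
date                                # system clock
cat /etc/adjtime                    # third line should say LOCAL or UTC
grep -i utc /etc/sysconfig/clock    # CentOS 6-style setting, if the file is present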
[CentOS-virt] Timezone issues with migrations between host kernel 3.10 and 3.18
Hi All,

I'm seeing clock issues with live migrations on the latest kernel packages: migrating a VM from 3.10.68-11 to 3.18.17-13 results in the VM clock being off by 7 hours (I'm PST, so it appears to be a timezone issue). This is also between xen versions, but rolling the target back to 3.10 resolved it, so I don't believe the recent XSAs are related.

Anyone else seen behavior like this or have any ideas on how to resolve it?

- Nathan
Re: [CentOS-virt] CentOS Images on AWS with partitions on /dev/xvda1 are awkwared to resize
> > So you're working from the command line tools in the EPEL 'cloud-init'
> > package, not the AWS GUI? Because when I tried expanding the size of
> > the base disk image in the GUI, I wound up with an 8 Gig default
> > /dev/xvda1 on a 20 Gig /dev/xvda. That's why I was looking at "how do
> > I resize this thing safely?"

No experience with Amazon here, but I routinely resize filesystems online without issues. Repartition xvda so that xvda1 is the size you want (make sure you use the same start sector, just change the end of the partition). Run partprobe and confirm that fdisk -l /dev/xvda1 shows the new size. (You may need to reboot.) After that, just run resize2fs /dev/xvda1 (works online).

- Nathan
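Spelled out, that procedure looks roughly like this (sketch; assumes xvda1 is the last partition and you keep the same start sector):

fdisk /dev/xvda        # d, then n with the same start sector and a larger end, then w
partprobe /dev/xvda    # ask the kernel to re-read the partition table (may require a reboot if busy)
resize2fs /dev/xvda1   # grow the ext3/ext4 filesystem online to fill the partition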
Re: [CentOS-virt] Seeing dropped packets / tcp retrans on latest 4.4.1-10el6
Hi All,

I've tracked this down... We do rate limiting of our VMs with a mix of ebtables/tc. Running these commands (replace vif1.0 with the correct vif for your VM) will reproduce this:

ebtables -A FORWARD -i vif1.0 -j mark --set-mark 990 --mark-target CONTINUE
tc qdisc add dev bond0 root handle 1: htb default 2
tc class add dev bond0 parent 1: classid 1:0 htb rate 1mbit
tc class add dev bond0 parent 1: classid 1:990 htb rate 1mbit
tc filter add dev bond0 protocol ip parent 1:0 prio 990 handle 990 fw flowid 1:990

Note that the speed limits being applied here are 10gb and I'm testing this on a 1gb network, so TC shouldn't really be doing anything here except letting the packets through. These same commands worked fine on Gentoo xen 4.1 / kernel 3.2.57, compared to this now not working on CentOS xen 4.4.1 / kernel 3.10.68.

The easiest way to reproduce is to simply generate a large file, scp it to a remote host, and on the remote host run:

tshark -Y "tcp.analysis.duplicate_ack_num"

If you run the ssh in a loop + tshark in another window, you can see the Dup ACKs begin immediately after adding the last filter rule:

25790294 1752.756733 xxx.xxx.xxx.13 -> xxx.xxx.xxx.205 TCP 78 [TCP Dup ACK 25790286#4] ssh > 51515 [ACK] Seq=15994 Ack=50769840 Win=1544704 Len=0 TSval=738150929 TSecr=4294944346 SLE=50785768 SRE=50790596
25790296 1752.756742 xxx.xxx.xxx.13 -> xxx.xxx.xxx.205 TCP 78 [TCP Dup ACK 25790286#5] ssh > 51515 [ACK] Seq=15994 Ack=50769840 Win=1544704 Len=0 TSval=738150929 TSecr=4294944346 SLE=50785768 SRE=50792044

- Nathan
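Spelled out, the reproduction described above is roughly the following (sketch; remote-host is a placeholder):

# on the domU: generate a test file and push it out repeatedly
dd if=/dev/urandom of=/tmp/bigfile bs=1M count=250
while true; do scp /tmp/bigfile remote-host:/tmp/; done

# on the remote host: watch for duplicate ACKs
tshark -Y "tcp.analysis.duplicate_ack_num"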
Re: [CentOS-virt] Seeing dropped packets / tcp retrans on latest 4.4.1-10el6
So I might have been misinterpreting things here and might be way off base. I think you can ignore this thread, and I'll follow up if I get anything concrete down the road =)

The retransmissions I'm seeing and reproducing are probably within normal allowances, and I can't reproduce the issue that originally led me down this path.

- Nathan

> -----Original Message-----
> From: centos-virt-boun...@centos.org [mailto:centos-virt-boun...@centos.org] On Behalf Of Nathan March
> Sent: Wednesday, April 15, 2015 1:13 PM
> To: 'Discussion about the virtualization on CentOS'
> Subject: Re: [CentOS-virt] Seeing dropped packets / tcp retrans on latest 4.4.1-10el6
>
> Hi All,
>
> Some more data on this, I've reproduced this on another host that's a
> completely stock CentOS/Xen deployment with a CentOS 6.6 domU.
>
> Since I'm seeing the retransmissions on the VIF, I don't think it's related to
> the network stack, but just in case.. Each host is connected via LACP with vlan
> tagging to a pair of stacked Cisco 3750's. Host networking config is here:
>
> http://dpaste.com/1Q6NY3Y
>
> The VM is on br99 here.
>
> This is easily reproducible by just generating a 250mb random file and doing
> an scp, while watching with tshark:
>
> tshark -R "tcp.analysis.retransmission"
>
> There's no visible impact to the connection the vast majority of the time,
> which is why I think this has gone unnoticed.
>
> Just to confirm this wasn't related to hardware / NICs, I've reproduced this on:
>
> - Dell PowerEdge M620 with Broadcom NICs
> - Dell C6220 with Intel NICs
> - Supermicro X8DTT with Intel NICs
>
> Any ideas? =)
>
> - Nathan
Re: [CentOS-virt] Seeing dropped packets / tcp retrans on latest 4.4.1-10el6
Hi All,

Some more data on this: I've reproduced this on another host that's a completely stock CentOS/Xen deployment with a CentOS 6.6 domU.

Since I'm seeing the retransmissions on the VIF, I don't think it's related to the network stack, but just in case.. Each host is connected via LACP with vlan tagging to a pair of stacked Cisco 3750's. Host networking config is here:

http://dpaste.com/1Q6NY3Y

The VM is on br99 here.

This is easily reproducible by just generating a 250mb random file and doing an scp, while watching with tshark:

tshark -R "tcp.analysis.retransmission"

There's no visible impact to the connection the vast majority of the time, which is why I think this has gone unnoticed.

Just to confirm this wasn't related to hardware / NICs, I've reproduced this on:

- Dell PowerEdge M620 with Broadcom NICs
- Dell C6220 with Intel NICs
- Supermicro X8DTT with Intel NICs

Any ideas? =)

- Nathan
[CentOS-virt] Seeing dropped packets / tcp retrans on latest 4.4.1-10el6
Hi All,

I was troubleshooting some odd VM network issues and discovered that we're seeing dropped packets + retransmissions across multiple domU OSes and dom0 hardware platforms.

xendev01 ~ # tshark -R "tcp.analysis.retransmission " -i vif7.0
Running as user "root" and group "root". This could be dangerous.
Capturing on vif7.0
3.054257 xxx.xxx.xxx.196 -> xxx.xxx.xxx.145 SSH 110 [TCP Fast Retransmission] Encrypted response packet len=44
3.061949 xxx.xxx.xxx.196 -> xxx.xxx.xxx.145 SSH 1434 [TCP Fast Retransmission] Encrypted response packet len=1368
3.383880 xxx.xxx.xxx.196 -> xxx.xxx.xxx.145 SSH 1434 [TCP Fast Retransmission] Encrypted response packet len=1368
3.630911 xxx.xxx.xxx.196 -> xxx.xxx.xxx.145 SSH 1434 [TCP Fast Retransmission] Encrypted response packet len=1368
3.635964 xxx.xxx.xxx.196 -> xxx.xxx.xxx.145 SSH 1434 [TCP Fast Retransmission] Encrypted response packet len=1368

I've confirmed this is happening with Linux, Windows and pfSense (BSD) domUs. I've turned off every feature I can with ethtool on the underlying bridge on the host, the vifs, and the eths inside the domUs. I also see it on traffic between VMs on the same host.

The domU sees packet errors on incoming traffic while outgoing looks fine; dumping on the dom0 indicates incoming packets are fine, but the reply from the domU is broken. This does not happen running the exact same VMs on some older xen 4.1.3 hosts.

Reproduction is easy (for me at least): any burst of traffic will do it. I've just been running "ps auxf" over ssh to a VM to trigger it. Since I'm seeing it on the host when I sniff the vif, this feels like a bug?

- Nathan
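For anyone wanting to rule out offloads the same way, the ethtool toggling mentioned above looks roughly like this (sketch; vif7.0 and br99 are this host's device names, and not every feature exists on every device):

ethtool -k vif7.0                                               # show current offload settings
ethtool -K vif7.0 tx off rx off sg off tso off gso off gro off  # disable checksum/segmentation offloads on the vif
ethtool -K br99 sg off tso off gso off gro off                  # and on the bridge, where supported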
Re: [CentOS-virt] Can't block-attach a file on a read only volume?
> http://cbs.centos.org/kojifiles/work/tasks/8801/8801/
>
> If you could test those and let me know if it fixes your problem, I'd
> appreciate it. :-)

Confirmed, both issues are fixed. Thanks! Any plans to push those packages to the main mirrors?

- Nathan
[CentOS-virt] Can't block-attach a file on a read only volume?
Hi All,

One more weird issue, this works on old xen but fails on 4.4:

xendev01 ~ # mkdir /mnt/test
xendev01 ~ # mount -t tmpfs - /mnt/test
xendev01 ~ # dd if=/dev/null of=/mnt/test/disk seek=100M bs=1
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000201809 s, 0.0 kB/s
xendev01 ~ # /usr/sbin/xl block-attach nathannx "file:/mnt/test/disk" "xvdd4"
DEBUG libxl__blktap_devpath 37 aio:/mnt/test/disk
DEBUG libxl__blktap_devpath 40 /dev/xen/blktap-2/tapdev20
xendev01 ~ # xl block-detach nathannx 51764
DEBUG libxl__device_destroy_tapdisk 66 type=aio:/mnt/test/disk disk=:/mnt/test/disk
xendev01 ~ # mount -o remount,ro /mnt/test
xendev01 ~ # /usr/sbin/xl block-attach nathannx "file:/mnt/test/disk" "xvdd4"
DEBUG libxl__blktap_devpath 37 aio:/mnt/test/disk
libxl: error: libxl.c:2149:device_disk_add: failed to get blktap devpath for 0xd3abd0
libxl: error: libxl.c:1727:device_addrm_aocomplete: unable to (null) device
libxl_device_disk_add failed.

I'm not sure why xen would care if the disk is writable? Would be nice to be able to mount these since many NFS storage arrays provide read-only access to snapshots.

- Nathan
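One possible workaround until blktap copes with read-only files, sketched but untested: hand libxl a read-only loop device instead of the file, so it uses the phy backend rather than tapdisk:

losetup -r /dev/loop0 /mnt/test/disk            # -r sets up the loop device read-only
xl block-attach nathannx 'phy:/dev/loop0,xvdd4,r'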
[CentOS-virt] Tapdisk processes being left behind when hvm domu's migrate/shutdown
Hi All, I'm seeing tapdisk processes not being terminated after a HVM vm is shutdown or migrated away. I don't see this problem with linux paravirt domu's, just windows hvm ones. xl.cfg: name = 'nathanwin' memory = 4096 vcpus = 2 disk = [ 'file:/mnt/gtc_disk_p1/nathanwin/drive_c,hda,w' ] vif = [ 'mac=00:16:3D:01:03:E0,bridge=vlan208' ] builder = "hvm" kernel = "/usr/lib/xen/boot/hvmloader" localtime = 0 on_poweroff = "destroy" on_reboot = "restart" on_crash = "destroy" vnc = 1 vncunused = 1 cpuid = [ '0:eax=1011', '1:eax=001001101110,ecx=101110111010001000100011,edx=0001000010111011', '2:eax=01010101001101011011', '7,0:eax=,ebx=,ecx=,edx=', '13,1:eax=xxx0', '10:ebx=', '11:edx=', '2147483650:eax=01100101011101000110111001001001,ebx=0010100101010010001011101100,ecx=01100110010101011010,edx=0010100101010010001011101110', '2147483651:eax=01010101010101110010,ebx=0010001000100010,ecx=0010001000100010,edx=0100111000100010', '2147483652:eax=001100110111011000110101,ebx=001001100010,ecx=00110111001100100010111000110010,edx=0010010011000111', '2147483656:eax=001100101000', ] Starting with the VM running initially on another host, I migrate it in: migration target: Ready to receive domain. Saving to migration stream new xl format (info 0x0/0x0/1450) Loading new save file (new xl fmt info 0x0/0x0/1450) Savefile contains xl domain config WARNING: ignoring "kernel" directive for HVM guest. Use "firmware_override" instead if you really want a non-default firmware xc: progress: Reloading memory pages: 56320/11141935% xc: progress: Reloading memory pages: 1003520/1114193 90% DEBUG libxl__blktap_devpath 37 aio:/mnt/gtc_disk_p1/nathanwin/drive_c DEBUG libxl__blktap_devpath 40 /dev/xen/blktap-2/tapdev0 DEBUG libxl__blktap_devpath 37 aio:/mnt/gtc_disk_p1/nathanwin/drive_c DEBUG libxl__blktap_devpath 40 /dev/xen/blktap-2/tapdev2 migration target: Transfer complete, requesting permission to start domain. migration sender: Target has acknowledged transfer. migration sender: Giving target permission to start. migration target: Got permission, starting domain. migration target: Domain started successsfully. migration sender: Target reports successful startup. DEBUG libxl__device_destroy_tapdisk 66 type=aio:/mnt/gtc_disk_p1/nathanwin/drive_c disk=:/mnt/gtc_disk_p1/nathanwin/drive_c Migration successful. 
and now I have 2 tapdisk procs: gtc-vana-005 ~ # ps auxf | grep tapdisk root 32491 0.1 0.2 20364 4636 ?SLs 11:06 0:00 tapdisk root 32520 0.0 0.2 20364 4636 ?SLs 11:06 0:00 tapdisk Which seems odd given that the VM in question only has a single disk attached to it and the qemu proc indicates it's using tapdev2: root 32524 0.4 0.7 323208 15040 ?SLsl 11:06 0:00 /usr/lib/xen/bin/qemu-system-i386 -xen-domid 3 -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-3,server,nowait -mon chardev=libxl-cmd,mode=control -nodefaults -name nathanwin--incoming -vnc 127.0.0.1:0,to=99 -device cirrus-vga -global vga.vram_size_mb=8 -boot order=cda -smp 2,maxcpus=2 -device rtl8139,id=nic0,netdev=net0,mac=00:16:3d:01:03:e0 -netdev type=tap,id=net0,ifname=vif3.0-emu,script=no,downscript=no -incoming fd:13 -machine xenfv -m 4088 -drive file=/dev/xen/blktap-2/tapdev2,if=ide,index=0,media=disk,format=raw,cache=writeback gtc-vana-005 ~ # lsof -p 32520 | grep blktap-2 tapdisk 32520 root memCHR 246,2 886671 /dev/xen/blktap-2/blktap2 tapdisk 32520 root 19u CHR 246,2 0t0 886671 /dev/xen/blktap-2/blktap2 gtc-vana-005 ~ # lsof -p 32491 | grep blktap-2 tapdisk 32491 root memCHR 246,0 903999 /dev/xen/blktap-2/blktap0 tapdisk 32491 root 14u CHR 246,0 0t0 903999 /dev/xen/blktap-2/blktap0 I then migrate this VM off to another host: migration target: Ready to receive domain. Saving to migration stream new xl format (info 0x0/0x0/1450) Loading new save file (new xl fmt info 0x0/0x0/1450) Savefile contains xl domain config WARNING: ignoring "kernel" directive for HVM guest. Use "firmware_override" instead if you really want a non-default firmware xc: progress: Reloading memory pages: 56320/11141935% xc: progress: Reloading memory pages: 1003520/1114193 90% DEBUG libxl__blktap_devpath 37 aio:/mnt/gtc_disk_p1/na
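Until the root cause is found, leftover tapdisk instances can usually be inspected and torn down by hand with tap-ctl; a rough sketch (match the pid/minor against what tap-ctl actually reports on your host, not the values shown here):

tap-ctl list                    # shows pid, minor, state and backing file of each running tapdisk
tap-ctl destroy -p 32491 -m 0   # tear down the orphaned instance, using the pid/minor from the list output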
Re: [CentOS-virt] Masking CPU flags via libvirt xml not working?
On 8/26/2014 4:52 PM, Nathan March wrote:
> Has anyone here managed to get cpu masking working via libvirt? Intention
> to enable VM migrations between hosts of a different CPU generation.

To add to this, I've tried using the boot options to set the cpu mask instead:

xen_commandline: dom0_mem=2048M,max:2048M loglvl=all guest_loglvl=all cpuid_mask_ecx=0x009ee3fd cpuid_mask_edx=0xbfebfbff

Unfortunately, still no luck. There are no errors in xm dmesg to indicate the settings were / weren't applied; it simply doesn't seem to do anything.

- Nathan
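A quick way to see whether the masks are doing anything at all is to diff the visible feature flags between the two generations of host (or between dom0 and a guest); sketch, hostA/hostB are placeholders:

diff <(ssh hostA "grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort") \
     <(ssh hostB "grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort")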
[CentOS-virt] Masking CPU flags via libvirt xml not working?
Hi,

Has anyone here managed to get cpu masking working via libvirt? The intention is to enable VM migrations between hosts of a different CPU generation.

Inside my XML I'm providing the model (arch x86_64, model Westmere) as well as a list of features to specifically disable, but none of it seems to take any effect. On booting the VM I still see the disabled flags in /proc/cpuinfo.

Doing a dumpxml against the domU once it's booted leaves out the entire cpu section, leading me to think maybe libvirt is dropping it for some reason. I've got the above just in the main section.

Anyone have this working, or able to offer some suggestions?

Thanks!

- Nathan
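For comparison, the kind of cpu element being described is roughly the following (illustrative only; the feature names are examples, and whether the libvirt libxl driver honors this at all is exactly the open question here):

<cpu match='exact'>
  <model>Westmere</model>
  <feature policy='disable' name='aes'/>
  <feature policy='disable' name='pclmuldq'/>
</cpu>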