Re: Interrupt time inflation on Xen
On Fri, Apr 15, 2016 at 12:52:36AM -0400, Thor Lancelot Simon wrote:
> It definitely does for me (that same dd, but from /dev/rsd0d, goes to
> 60% "Interrupt" time on pkgbuild). I can't help noticing everyone who
> doesn't see the problem is using wd, while I see it with ciss or arcmsr.

I didn't mean to sound as if what you're seeing isn't real. Sorry if I did.

> I have trouble seeing how the SCSI code itself could be to blame, but I
> wonder if these two drivers have something in common (how they use bus_dma
> perhaps)?

I was wondering about that. Especially since the wds are on ahcisata and using DMA too.

> Now that Manuel fixed profiling, I can confirm at least part of your
> suspicion:
>
> index % time    self  children    called     name
> [1]     85.7   38.10    0.00                 hypercall_page [1]
> ---
>
> This is of course not a terribly useful profiling record since it cannot
> find any other functions in the call graph, so we cannot see which hypercall
> might be to blame. I *think* the use of static inlines in hypercall.h is
> causing that problem, though I don't understand why the "callers" of those
> inline functions are missing from the call graph.
>
> Still puzzling about how to work through this further.

Remove the static inlines and see if that lets the callers show up? But I guess you thought of that yourself.

--chris
Re: Interrupt time inflation on Xen
On Wed, Mar 30, 2016 at 03:34:42PM +0200, Christoph Badura wrote:
>
> Maybe there are some very expensive hypervisor operations being called
> that lead to all "System" and "Interrupt" time. Perhaps waiting for the
> hypervisor to complete the operations.
>
> Just dd'ing from disk to /dev/null doesn't show increased "Interrupt" time
> under xen. (dd if=/dev/rwd0d of=/dev/null bs=64k count=1)
> It's about 16% "System" time while the dd runs, otherwise idle. I get
> about 125 MB/s throughput.

It definitely does for me (that same dd, but from /dev/rsd0d, goes to 60% "Interrupt" time on pkgbuild). I can't help noticing everyone who doesn't see the problem is using wd, while I see it with ciss or arcmsr.

I have trouble seeing how the SCSI code itself could be to blame, but I wonder if these two drivers have something in common (how they use bus_dma perhaps)?

Now that Manuel fixed profiling, I can confirm at least part of your suspicion:

index % time    self  children    called     name
[1]     85.7   38.10    0.00                 hypercall_page [1]
---

This is of course not a terribly useful profiling record since it cannot find any other functions in the call graph, so we cannot see which hypercall might be to blame. I *think* the use of static inlines in hypercall.h is causing that problem, though I don't understand why the "callers" of those inline functions are missing from the call graph.

Still puzzling about how to work through this further.

Thor
Re: Interrupt time inflation on Xen
On Wed, Mar 23, 2016 at 11:01:08AM +0100, Manuel Bouyer wrote:
> Here's a few tests I ran.
> Hardware is a Dell optiplex 755 (core 2 duo, WDC WD20EARS-00MVWB0, on Intel
> AHCI):

[results snipped]

You're not seeing anything close to the out-of-control interrupt time I'm seeing. My hardware, maybe?

One thing I notice is that on this hardware, it is impossible to disable VT-D. We saw a problem on another box where with VT-D, interrupts were never delivered... but that is not what we are seeing here.

Thor
Re: Interrupt time inflation on Xen
Here's a few tests I ran.
Hardware is a Dell optiplex 755 (core 2 duo, WDC WD20EARS-00MVWB0, on Intel AHCI):

Xen45/netbsd-6:
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 53.957 secs (121459680 bytes/sec)
systat says:
11.4% Sy 0.8% Us 0.0% Ni 2.4% In 85.4% Id
1850 ops/s, 1850 irq/s on ioapic0 pin 18

netbsd-6 bare metal, booted with -1 (as XEN3_DOM0 is not SMP yet):
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 53.882 secs (121628744 bytes/sec)
systat says:
2.6% Sy 0.0% Us 0.0% Ni 1.8% In 95.6% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

xen45/netbsd-7:
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 54.258 secs (120785874 bytes/sec)
systat says:
15.8% Sy 0.8% Us 0.0% Ni 0.6% In 82.8% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

netbsd-7 bare metal, booted with -1 (as XEN3_DOM0 is not SMP yet):
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 54.175 secs (120970927 bytes/sec)
systat says:
3.6% Sy 0.2% Us 0.0% Ni 1.4% In 94.8% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

xen31/netbsd-7:
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
systat says:
15.6% Sy 0.2% Us 0.0% Ni 0.8% In 83.4% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

No, I'm not seeing twice the interrupts as you see with the SCSI controllers. I suspect it could be interrupt coalescing, which may kick in with a bare-metal kernel and not with the Xen one, as the Xen one is slower.

For I/Os, Xen is not slower in my case but uses much more system time. This is probably because on this hardware, the limiting factor is the hard disk and not the CPU.
This time is not spent in hard interrupt handlers; I actually think it's MMU-related (page remapping or invalidations). That makes sense, as in the Xen/PV case these operations need a hypercall.
There are no big differences between netbsd-6 and netbsd-7, or between Xen versions.

I re-ran the tests with the attached program, which should avoid reading data from disk (the data should be in the disk's cache, so we're benchmarking the SATA link/controller and the data movement between kernel and userland).

On Xen/netbsd-7:
23.4% Sy 0.4% Us 0.0% Ni 0.8% In 75.4% Id
borneo:/home/bouyer#./tst /dev/rwd0d 100000
37753400 us, 165.548004 MB/s

UP bare-metal:
4.0% Sy 0.0% Us 0.0% Ni 3.6% In 92.4% Id
borneo:/home/bouyer#./tst /dev/rwd0d 100000
30945314 us, 201.969190 MB/s

Here we have a true difference (also, Xen/netbsd-7 can't keep the disk more than 70% busy, while bare metal is more than 99% busy, maybe because interrupt latency is higher with Xen).
The result is the same with different Xen versions.

Maybe some improvement is possible in pmap.c; this needs to be looked at. But I don't think we'll get more than a few %.
Maybe a XEN3PAE_DOM0 would also do better (with amd64/xen the kernel runs unprivileged and has a completely separate vm space from userland, while the i386PAE kernel runs in ring 1 and shares its vm space with userland; this makes a big difference in pmap). I never tried, but it should be possible to run an amd64 Xen with an i386pae dom0.

The real fix would be to run the dom0 kernel as an HVM domain on hardware that supports it. But this requires quite a bit of work.
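--
Manuel Bouyer
     NetBSD: 26 years of experience will always make the difference

#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* usage: tst device count; rereads the first 64k of device count times */
int
main(int argc, char **argv)
{
	static char buf[64*1024];
	int fd, i, count;
	struct timeval tv0, tv1;
	long long t;

	if (argc != 3) {
		fprintf(stderr, "usage: %s device count\n", argv[0]);
		exit(1);
	}
	count = atoi(argv[2]);
	fd = open(argv[1], O_RDONLY, 0);
	if (fd < 0) {
		perror("open");
		exit(1);
	}
	if (gettimeofday(&tv0, NULL) < 0) {
		perror("gettimeofday");
		exit(1);
	}
	for (i = 0; i < count; i++) {
		if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
			perror("read");
			exit(1);
		}
		/* seek back so the next read hits the same block again */
		if (lseek(fd, 0, SEEK_SET) < 0) {
			perror("seek");
			exit(1);
		}
	}
	if (gettimeofday(&tv1, NULL) < 0) {
		perror("gettimeofday");
		exit(1);
	}
	/* elapsed time in microseconds */
	t = (long long)(tv1.tv_sec - tv0.tv_sec) * 1000000;
	t = t + tv1.tv_usec - tv0.tv_usec;
	/* 64 KB per iteration, reported as MB per second */
	printf("%lld us, %f MB/s\n", t,
	    ((double)64 * (double)i / 1024) / ((double)t / 1000000));
	exit(0);
}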
Re: Interrupt time inflation on Xen
On Tue, Mar 22, 2016 at 07:22:28PM -0400, Thor Lancelot Simon wrote:
> On Tue, Mar 22, 2016 at 01:28:06PM -0400, Greg Troxel wrote:
> >
> > I wonder if this is a netbsd-7 vs netbsd-6 thing. Or only triggered by
> > some interrupt mappings.
>
> I'm wondering a few things:
>
> Which version of Xen are the rest of you using? I'm on 4.5.
>
> Are the ATA drivers you're using still giant-locked, or are they now
> MPSAFE (I'm digging, but I'm at the end of a long thin network pipe
> today with only very old source trees on hand)? Both the ciss and arcmsr
> where I've seen this problem are SCSI, and the SCSI subsystem takes
> KERNEL_LOCK.

(S)ATA is not MPSAFE

--
Manuel Bouyer
     NetBSD: 26 years of experience will always make the difference
Re: Interrupt time inflation on Xen
On Tue, Mar 22, 2016 at 01:28:06PM -0400, Greg Troxel wrote:
>
> I wonder if this is a netbsd-7 vs netbsd-6 thing. Or only triggered by
> some interrupt mappings.

I'm wondering a few things:

Which version of Xen are the rest of you using? I'm on 4.5.

Are the ATA drivers you're using still giant-locked, or are they now MPSAFE (I'm digging, but I'm at the end of a long thin network pipe today with only very old source trees on hand)? Both the ciss and arcmsr where I've seen this problem are SCSI, and the SCSI subsystem takes KERNEL_LOCK.

--
Thor Lancelot Simon t...@panix.com
"We cannot usually in social life pursue a single value or a single moral aim, untroubled by the need to compromise with others." - H.L.A. Hart
Re: Interrupt time inflation on Xen
Thor Lancelot Simon writes:

> The same test's easy to reproduce with any Xen dom0, since all you have
> to do is dd from the raw disk into /dev/null. Do you see >40% interrupt
> time and all your idle CPU go away? That's the issue.

14% sys, mostly idle
1000 interrupts on what looks like the disk ioapic pin, and 1000 xfer/s,
total interrupts 1000-200.

netbsd-6, i386 XEN3PAE_DOM0

Also ok on netbsd-5 XEN3_DOM0 amd64 (disk at disk speed, low sys, right number of disk interrupts, mostly idle).

Both systems have normal mobo disk controllers. The second is:

acpi0: X/RSDT: OemId , AslId <,0113>
piixide0 at pci0 dev 31 function 2
piixide0: Intel 82801I Serial ATA Controller (ICH9) (rev. 0x02)
piixide0: bus-master DMA support present
piixide0: primary channel configured to native-PCI mode
piixide0: using ioapic0 pin 21, event channel 5 for native-PCI interrupt
atabus2 at piixide0 channel 0
piixide0: secondary channel configured to native-PCI mode
atabus3 at piixide0 channel 1
piixide1 at pci0 dev 31 function 5
piixide1: Intel 82801I Serial ATA Controller (ICH9) (rev. 0x02)
piixide1: bus-master DMA support present
piixide1: primary channel wired to native-PCI mode
piixide1: using ioapic0 pin 21, event channel 5 for native-PCI interrupt
atabus4 at piixide1 channel 0
piixide1: secondary channel wired to native-PCI mode
atabus5 at piixide1 channel 1
wd0 at atabus2 drive 0:
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 698 GB, 1453521 cyl, 16 head, 63 sec, 512 bytes/sect x 1465149168 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)

I wonder if this is a netbsd-7 vs netbsd-6 thing. Or only triggered by some interrupt mappings.
Re: Interrupt time inflation on Xen
On Tue, Mar 22, 2016 at 09:14:33AM +0100, Manuel Bouyer wrote:
> On Mon, Mar 21, 2016 at 07:10:51PM -0400, Thor Lancelot Simon wrote:
>
> > What's particularly curious here to me is that we are seeing
> > twice as many interrupts as we should, in the poorly performing Xen
> > case.
>
> What controller is it ? maybe when running faster it's able
> to coalesce some interrupts ?
> The irq rate seems to be mostly the same in both cases, but with the
> number of I/O ops cut by half in the Xen case.

The controller is a ciss. I observed the same problems with an arcmsr.

I don't think this has anything to do with the controller. At 1/2 the data and request rate, under Xen we use 60% CPU as interrupt time while with a native kernel, it's 1-3%.

The same test's easy to reproduce with any Xen dom0, since all you have to do is dd from the raw disk into /dev/null. Do you see >40% interrupt time and all your idle CPU go away? That's the issue.

Thor
Re: Interrupt time inflation on Xen
On Mon, Mar 21, 2016 at 07:10:51PM -0400, Thor Lancelot Simon wrote:
> On Sun, Mar 20, 2016 at 08:19:23PM -0400, Thor Lancelot Simon wrote:
> >
> > I've attached the dmesg from the GENERIC kernel and the boot-time
> > messages (giving both the xen dmesg and NetBSD dmesg) from the XEN3_DOM0
> > kernel. Xen kernel is 4.5 from pkgsrc; it is NOT a debug kernel.
> >
> > Interrupt routing seems to be the same, disk controller is on ioapic1 pin 6
> > in both cases.
>
> However, the Xen kernel emits:
>
> ioapic1 at mainbus0 apid 7
> ioapic1: can't remap to apid 7
>
> Harmless?

I think it is, as it's trying to remap to the same value (AFAIK). But I don't know the x86 interrupt hardware that well.

> What's particularly curious here to me is that we are seeing
> twice as many interrupts as we should, in the poorly performing Xen
> case.

What controller is it ? maybe when running faster it's able to coalesce some interrupts ?
The irq rate seems to be mostly the same in both cases, but with the number of I/O ops cut by half in the Xen case.

--
Manuel Bouyer
     NetBSD: 26 years of experience will always make the difference
Re: Interrupt time inflation on Xen
On Sun, Mar 20, 2016 at 08:19:23PM -0400, Thor Lancelot Simon wrote:
>
> I've attached the dmesg from the GENERIC kernel and the boot-time
> messages (giving both the xen dmesg and NetBSD dmesg) from the XEN3_DOM0
> kernel. Xen kernel is 4.5 from pkgsrc; it is NOT a debug kernel.
>
> Interrupt routing seems to be the same, disk controller is on ioapic1 pin 6
> in both cases.

However, the Xen kernel emits:

ioapic1 at mainbus0 apid 7
ioapic1: can't remap to apid 7

Harmless?

What's particularly curious here to me is that we are seeing twice as many interrupts as we should, in the poorly performing Xen case.

Thor
Re: Interrupt time inflation on Xen
Same machine, new disk controller, new "disks" (SSDs), made it a pure disk I/O test. Same results:

dd if=/dev/rsd0d of=/dev/null bs=1m

With a netbsd-7 branch GENERIC kernel, I get:

* 640MB/sec
* about 2% interrupt time (system mostly idle)
* about 16,500 interrupts/sec, of which 10,000 are TLB shootdown IPIs and 6,500 are from the disk controller, which unsurprisingly is doing 6,500 IOPS.

With a netbsd-7 branch XEN3_DOM0 kernel, I get:

* 300MB/sec
* about 60% interrupt time (system noticeably laggy)
* about 6,000 interrupts/sec, all of which seem to be "from" the disk controller, which is doing only about 3,000 IOPS.

I've attached the dmesg from the GENERIC kernel and the boot-time messages (giving both the xen dmesg and NetBSD dmesg) from the XEN3_DOM0 kernel. Xen kernel is 4.5 from pkgsrc; it is NOT a debug kernel.

Interrupt routing seems to be the same, disk controller is on ioapic1 pin 6 in both cases.

Thor

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved.
NetBSD 7.0 (GENERIC.201509250726Z)
total memory = 73718 MB
avail memory = 71562 MB
kern.module.path=/stand/amd64/7.0/modules
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell CS24-TY(A00 )
mainbus0 (root)
ACPI: RSDP 0xfadd0 24 (v02 ACPIAM)
ACPI: XSDT 0xbf6f0100 8C (v01 110410 XSDT1337 20101104 MSFT 0097)
ACPI: FACP 0xbf6f0290 F4 (v03 110410 FACP1337 20101104 MSFT 0097)
ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/64 (20131218/tbfadt-634)
ACPI: DSDT 0xbf6f05d0 005DDD (v01 DS993 DS993B16 0B16 INTL 20051117)
ACPI: FACS 0xbf6fe000 40
ACPI: APIC 0xbf6f0390 0001AE (v01 110410 APIC1337 20101104 MSFT 0097)
ACPI: SPCR 0xbf6f0540 50 (v01 110410 SPCR1337 20101104 MSFT 0097)
ACPI: MCFG 0xbf6f0590 3C (v01 110410 OEMMCFG 20101104 MSFT 0097)
ACPI: SSDT 0xbf6fa5d0 2A (v01 OEM_ID OEMTBLID 0001 INTL 20051117)
ACPI: OEMB 0xbf6fe040 83 (v01 110410 OEMB1337 20101104 MSFT 0097)
ACPI: HPET 0xbf6fa600 38 (v01 110410 OEMHPET 20101104 MSFT 0097)
ACPI: DMAR 0xbf6fe0d0 000128 (v01AMI OEMDMAR 0001 MSFT 0097)
ACPI: SSDT 0xbf7033c0 000363 (v01 DpgPmmCpuPm 0012 INTL 20051117)
ACPI: EINJ 0xbf6fa640 000130 (v01 AMIER AMI_EINJ 20101104 MSFT 0097)
ACPI: BERT 0xbf6fa7d0 30 (v01 AMIER AMI_BERT 20101104 MSFT 0097)
ACPI: ERST 0xbf6fa800 0001B0 (v01 AMIER AMI_ERST 20101104 MSFT 0097)
ACPI: HEST 0xbf6fa9b0 A8 (v01 AMIER ABC_HEST 20101104 MSFT 0097)
ACPI: All ACPI Tables successfully acquired
ioapic0 at mainbus0 apid 6: pa 0xfec0, version 0x20, 24 pins
ioapic1 at mainbus0 apid 7: pa 0xfec8a000, version 0x20, 24 pins
cpu0 at mainbus0 apid 0: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu1 at mainbus0 apid 2: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu2 at mainbus0 apid 4: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu3 at mainbus0 apid 16: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu4 at mainbus0 apid 18: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu5 at mainbus0 apid 20: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu6 at mainbus0 apid 32: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu7 at mainbus0 apid 34: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu8 at mainbus0 apid 36: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu9 at mainbus0 apid 48: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu10 at mainbus0 apid 50: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu11 at mainbus0 apid 52: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu12 at mainbus0 apid 1: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu13 at mainbus0 apid 3: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu14 at mainbus0 apid 5: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu15 at mainbus0 apid 17: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu16 at mainbus0 apid 19: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu17 at mainbus0 apid 21: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu18 at mainbus0 apid 33: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu19 at mainbus0 apid 35: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu20 at mainbus0 apid 37: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz, id 0x206c2
cpu21 at mainbus0 apid 49: Intel(R) Xeon(R) CPU L5639
Re: Interrupt time inflation on Xen
On 2016-03-14 13:40, Thor Lancelot Simon wrote:
> On Mon, Mar 14, 2016 at 12:10:29PM +0100, Jean-Yves Migeon wrote:
> > And same question applies if you reduce bs (128m from 1g for example)
>
> I'll run the other test today or tomorrow -- this one's a "no" -- the
> problem is just as bad with bs=64k

Alright.

> (did I say I was using a 1g blocksize before? I wasn't, I was using
> a 1m blocksize).

Fair point :o When you get used to 1K blocks output...

--
Jean-Yves Migeon
Re: Interrupt time inflation on Xen
On Mon, Mar 14, 2016 at 12:10:29PM +0100, Jean-Yves Migeon wrote:
>
> And same question applies if you reduce bs (128m from 1g for example)

I'll run the other test today or tomorrow -- this one's a "no" -- the problem is just as bad with bs=64k (did I say I was using a 1g blocksize before? I wasn't, I was using a 1m blocksize).

--
Thor Lancelot Simon t...@panix.com
"We cannot usually in social life pursue a single value or a single moral aim, untroubled by the need to compromise with others." - H.L.A. Hart
Re: Interrupt time inflation on Xen
On 2016-03-14 00:08, Thor Lancelot Simon wrote:
> I had occasion a couple of days ago to try to block-copy a very large
> filesystem from a xen dom0 to another machine across a fast local network.
>
> I tried this:
>
> sysctl -w kern.sbmax=1000
> sysctl -w net.inet.tcp.sendbuf_auto=0
> dd if=/dev/rsd0g bs=1048576 | ttcp -t -b 2097152 -l 131072 -fm -p 9001 172.0.0.1
>
> On the far end, I had simply the corresponding ttcp -r job (with
> corresponding socket/tcp sysctls in place) redirected to a file.
>
> The underlying hardware is a RAID5 of 300GB SAS drives on an Areca 1680.
> The Ethernet controller in use is a 'wm'. All supported offload
> functionality is turned on (turning it off just slows things down).
>
> Shockingly, I saw the system go to 0% idle time, with 45-55% "Interrupt"
> and the rest "System". Interrupts per second were a comparatively low
> 1500, about 500 disk and 1000 network. Throughput was horrible -- about
> 35MB/sec.

This might not be related, or useful, but it does sound similar to what I've been seeing on VAXen for a few years now. I get unexpectedly high interrupt rates, but most significantly, I can get ridiculous system time when doing something like a "cvs update". I have never had the time to try and figure it out, but I normally have over 50% (usually around 70%) in system when doing this, which does not seem reasonable.

But once more, this might be a red herring, so feel free to ignore.

Johnny
Re: Interrupt time inflation on Xen
On 2016-03-14 00:08, Thor Lancelot Simon wrote:

[snip]

> I am going to install a different disk controller and see what happens,
> but I am not totally optimistic. Does anyone have any idea what might be
> going on?

Nope, but I would be interested in the throughput you get when you pipe dd if=/dev/rsd0g to /dev/null and /dev/zero to ttcp (without gzip), simultaneously.

Do you see any difference in the disk throughput when stopping ttcp?

And same question applies if you reduce bs (128m from 1g for example)

--
Jean-Yves Migeon
Re: Interrupt time inflation on Xen
On Mon, Mar 14, 2016 at 02:53:54AM +0100, Pierre Pronchery wrote:
> On 03/14/16 00:08, Thor Lancelot Simon wrote:
> > I had occasion a couple of days ago to try to block-copy a very large
> > filesystem from a xen dom0 to another machine across a fast local network.
> > [...]
> >
> > I am going to install a different disk controller and see what happens,
> > but I am not totally optimistic. Does anyone have any idea what might be
> > going on?
>
> Try this pull-up:
> https://mail-index.netbsd.org/source-changes/2015/11/16/msg070367.html

Looks unlikely -- why would this impact what I am doing strictly on the dom0, with no guests running? Dom0 does not access its disks via xbdback.

Thor
Interrupt time inflation on Xen
I had occasion a couple of days ago to try to block-copy a very large filesystem from a xen dom0 to another machine across a fast local network.

I tried this:

sysctl -w kern.sbmax=1000
sysctl -w net.inet.tcp.sendbuf_auto=0
dd if=/dev/rsd0g bs=1048576 | ttcp -t -b 2097152 -l 131072 -fm -p 9001 172.0.0.1

On the far end, I had simply the corresponding ttcp -r job (with corresponding socket/tcp sysctls in place) redirected to a file.

The underlying hardware is a RAID5 of 300GB SAS drives on an Areca 1680. The Ethernet controller in use is a 'wm'. All supported offload functionality is turned on (turning it off just slows things down).

Shockingly, I saw the system go to 0% idle time, with 45-55% "Interrupt" and the rest "System". Interrupts per second were a comparatively low 1500, about 500 disk and 1000 network. Throughput was horrible -- about 35MB/sec.

I rebooted the system to a GENERIC kernel; throughput was 100MB/sec and peaked at 200MB/sec when I piped the data through gzip -1. The system was mostly idle with under 2% "Interrupt" time. Interrupts/sec were significantly higher due to both the higher data rate and about 800 TLB shootdown interrupts/sec (the GENERIC kernel runs MP on this system).

Kernels are both netbsd-7 branch from just around the time of the release, XEN3_DOM0 and GENERIC. Neither kernel has DEBUG nor DIAGNOSTIC nor KMEMSTATS.

Do we have some kind of terrible problem that causes a 20% increase in interrupt time when running under Xen? Note the Areca driver (arcmsr) is a SCSI driver so it is giant-locked, but the XEN dom0 is UP, so this is probably a red herring.

I am going to install a different disk controller and see what happens, but I am not totally optimistic. Does anyone have any idea what might be going on?

--
Thor Lancelot Simon t...@panix.com
"We cannot usually in social life pursue a single value or a single moral aim, untroubled by the need to compromise with others." - H.L.A. Hart