Re: Interrupt time inflation on Xen

2016-04-15 Thread Christoph Badura
On Fri, Apr 15, 2016 at 12:52:36AM -0400, Thor Lancelot Simon wrote:
> It definitely does for me (that same dd, but from /dev/rsd0d, goes to
> 60% "Interrupt" time on pkgbuild).  I can't help noticing everyone who
> doesn't see the problem is using wd, while I see it with ciss or arcmsr.

I didn't mean to sound as if what you're seeing isn't real.  Sorry if I
did.

> I have trouble seeing how the SCSI code itself could be to blame, but I
> wonder if these two drivers have something in common (how they use bus_dma
> perhaps)?

I was wondering about that.  Especially since the wds are on ahcisata and
using DMA too.

> Now that Manuel fixed profiling, I can confirm at least part of your
> suspicion:
> 
> index % time    self  children    called     name
>  
> [1]     85.7   38.10    0.00                 hypercall_page [1]
> --- 
> 
> This is of course not a terribly useful profiling record since it cannot
> find any other functions in the call graph, so we cannot see which hypercall
> might be to blame.  I *think* the use of static inlines in hypercall.h is
> causing that problem, though I don't understand why the "callers" of those
> inline functions are missing from the call graph.
> 
> Still puzzling about how to work through this further.

Remove the static inlines and see if that lets the callers show up?  But I
guess you thought of that yourself.

--chris


Re: Interrupt time inflation on Xen

2016-04-14 Thread Thor Lancelot Simon
On Wed, Mar 30, 2016 at 03:34:42PM +0200, Christoph Badura wrote:
> 
> Maybe there are some very expensive hypervisor operations being called
> that lead to all "System" and "Interrupt" time.  Perhaps waiting for the
> hypervisor to complete the operations.
> 
> Just dd'ing from disk to /dev/null doesn't show increased "Interrupt" time
> under xen.  (dd if=/dev/rwd0d of=/dev/null bs=64k count=1)
> It's about 16% "System" time while the dd runs, otherwise idle.  I get
> about 125 MB/s throughput.

It definitely does for me (that same dd, but from /dev/rsd0d, goes to
60% "Interrupt" time on pkgbuild).  I can't help noticing everyone who
doesn't see the problem is using wd, while I see it with ciss or arcmsr.

I have trouble seeing how the SCSI code itself could be to blame, but I
wonder if these two drivers have something in common (how they use bus_dma
perhaps)?

Now that Manuel fixed profiling, I can confirm at least part of your
suspicion:

index % time    self  children    called     name
 
[1]     85.7   38.10    0.00                 hypercall_page [1]
--- 

This is of course not a terribly useful profiling record since it cannot
find any other functions in the call graph, so we cannot see which hypercall
might be to blame.  I *think* the use of static inlines in hypercall.h is
causing that problem, though I don't understand why the "callers" of those
inline functions are missing from the call graph.

Still puzzling about how to work through this further.

Thor


Re: Interrupt time inflation on Xen

2016-03-23 Thread Thor Lancelot Simon
On Wed, Mar 23, 2016 at 11:01:08AM +0100, Manuel Bouyer wrote:
> Here's a few tests I ran.
> Hardware is a Dell optiplex 755 (core 2 duo, WDC WD20EARS-00MVWB0, on Intel
> AHCI):

[results snipped]

You're not seeing anything close to the out-of-control interrupt time
I'm seeing.  My hardware, maybe?

One thing I notice is that on this hardware, it is impossible to disable
VT-D.  We saw a problem on another box where with VT-D, interrupts were
never delivered... but that is not what we are seeing here.

Thor


Re: Interrupt time inflation on Xen

2016-03-23 Thread Manuel Bouyer
Here's a few tests I ran.
Hardware is a Dell optiplex 755 (core 2 duo, WDC WD20EARS-00MVWB0, on Intel
AHCI):

Xen45/netbsd-6:
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 53.957 secs (121459680 bytes/sec)
systat says:
  11.4% Sy   0.8% Us   0.0% Ni   2.4% In  85.4% Id
1850 ops/s, 1850 irq/s on ioapic0 pin 18

netbsd-6 bare metal, booted with -1 (as XEN3_DOM0 is not SMP yet):
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 53.882 secs (121628744 bytes/sec)
systat says:
   2.6% Sy   0.0% Us   0.0% Ni   1.8% In  95.6% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

xen45/netbsd-7:
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 54.258 secs (120785874 bytes/sec)
systat says:
  15.8% Sy   0.8% Us   0.0% Ni   0.6% In  82.8% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

netbsd-7 bare metal, booted with -1 (as XEN3_DOM0 is not SMP yet):
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
6553600000 bytes transferred in 54.175 secs (120970927 bytes/sec)
systat says:
   3.6% Sy   0.2% Us   0.0% Ni   1.4% In  94.8% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

xen31/netbsd-7:
borneo:/home/bouyer#dd if=/dev/rwd0d of=/dev/null bs=64k count=100000
100000+0 records in
100000+0 records out
systat says:
  15.6% Sy   0.2% Us   0.0% Ni   0.8% In  83.4% Id
about 1850 ops/s, 1850 irq/s on ioapic0 pin 18

No, I'm not seeing twice the interrupts as you see with the SCSI controllers.
I suspect it could be interrupt coalescing, which may kick in with a bare-metal
kernel and not with the Xen one, as the Xen one is slower.

For I/Os, Xen is not slower in my case but uses much more system time.
This is probably because on this hardware, the limiting factor is the
hard disk and not the CPU.
This time is not spent in hard interrupt handlers; I actually think it's
MMU-related (page remapping or invalidations). It makes sense, as in the
Xen/PV case these operations need a hypercall. There are no big differences
between netbsd-6 and netbsd-7, or Xen versions.

I re-ran tests with the attached program, which should avoid reading data
from disk (the data should be in the disk's cache, so we're benchmarking
the SATA link/controller, and the data move between kernel and userland).

On Xen/netbsd-7:
  23.4% Sy   0.4% Us   0.0% Ni   0.8% In  75.4% Id
borneo:/home/bouyer#./tst /dev/rwd0d 100000
37753400 us, 165.548004 MB/s
UP bare-metal:
   4.0% Sy   0.0% Us   0.0% Ni   3.6% In  92.4% Id
borneo:/home/bouyer#./tst /dev/rwd0d 100000
30945314 us, 201.969190 MB/s

Here we have a true difference (also, Xen/netbsd-7 can't keep the disk more
than 70% busy, while bare metal is more than 99% - maybe because interrupt
latency is higher with Xen). The result is the same with different Xen 
versions.

Maybe some improvement is possible in pmap.c, this needs to be looked at.
But I don't think we'll get more than a few %.

Maybe also a XEN3PAE_DOM0 would do better (with amd64/xen the kernel
runs unprivileged and has a completely separate vm space from
userland, while i386PAE kernel runs in ring 1 and shares its vm space with
userland - this makes a big difference in pmap). I never tried, but it should
be possible to run an amd64 Xen with an i386pae dom0.

The real fix would be to run the dom0 kernel as a HVM domain on hardware
that supports it. But this requires quite a bit of work.

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference

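#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Usage: tst device count
 * Reads the first 64 KB of the device count times, seeking back to
 * offset 0 after each read, then prints elapsed time and throughput.
 */
int
main(int argc, char **argv)
{
	static char buf[64*1024];
	int fd, i, count;
	struct timeval tv0, tv1;
	long t;

	if (argc < 3) {
		fprintf(stderr, "usage: %s device count\n", argv[0]);
		exit(1);
	}
	count = atoi(argv[2]);
	fd = open(argv[1], O_RDONLY, 0);
	if (fd < 0) {
		perror("open");
		exit(1);
	}
	if (gettimeofday(&tv0, NULL) < 0) {
		perror("gettimeofday");
		exit(1);
	}
	for (i = 0; i < count; i++) {
		if (read(fd, buf, sizeof(buf)) != sizeof(buf)) {
			perror("read");
			exit(1);
		}
		/* rewind so the next read hits the same (cached) blocks */
		if (lseek(fd, 0, SEEK_SET) < 0) {
			perror("seek");
			exit(1);
		}
	}
	if (gettimeofday(&tv1, NULL) < 0) {
		perror("gettimeofday");
		exit(1);
	}
	/* elapsed time in microseconds */
	t = (tv1.tv_sec - tv0.tv_sec) * 1000000;
	t = t + tv1.tv_usec - tv0.tv_usec;
	printf("%ld us, %f MB/s\n", t,
	    ((double)64 * (double)i / 1024) / ((double)t / 1000000));
	exit(0);
}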


Re: Interrupt time inflation on Xen

2016-03-23 Thread Manuel Bouyer
On Tue, Mar 22, 2016 at 07:22:28PM -0400, Thor Lancelot Simon wrote:
> On Tue, Mar 22, 2016 at 01:28:06PM -0400, Greg Troxel wrote:
> > 
> > I wonder if this is a netbsd-7 vs netbsd-6 thing.  Or only triggered by
> > some interrupt mappings.
> 
> I'm wondering a few things:
> 
> Which version of Xen are the rest of you using?  I'm on 4.5.
> 
> Are the ATA drivers you're using still giant-locked, or are they now
> MPSAFE (I'm digging, but I'm at the end of a long thin network pipe
> today with only very old source trees on hand)?  Both the ciss and arcmsr
> where I've seen this problem are SCSI, and the SCSI subsystem takes
> KERNEL_LOCK.

(S)ATA is not MPSAFE

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


Re: Interrupt time inflation on Xen

2016-03-22 Thread Thor Lancelot Simon
On Tue, Mar 22, 2016 at 01:28:06PM -0400, Greg Troxel wrote:
> 
> I wonder if this is a netbsd-7 vs netbsd-6 thing.  Or only triggered by
> some interrupt mappings.

I'm wondering a few things:

Which version of Xen are the rest of you using?  I'm on 4.5.

Are the ATA drivers you're using still giant-locked, or are they now
MPSAFE (I'm digging, but I'm at the end of a long thin network pipe
today with only very old source trees on hand)?  Both the ciss and arcmsr
where I've seen this problem are SCSI, and the SCSI subsystem takes
KERNEL_LOCK.

-- 
  Thor Lancelot Simon                t...@panix.com

  "We cannot usually in social life pursue a single value or a single moral
   aim, untroubled by the need to compromise with others."  - H.L.A. Hart


Re: Interrupt time inflation on Xen

2016-03-22 Thread Greg Troxel

Thor Lancelot Simon writes:

> The same test's easy to reproduce with any Xen dom0, since all you have
> to do is dd from the raw disk into /dev/null.  Do you see >40% interrupt
> time and all your idle CPU go away?  That's the issue.

14% sys, mostly idle 1000 interrupts on what looks like the disk ioapic
pin, and 1000 xfer/s, total interrupts 1000-200.
netbsd-6, i386 XEN3PAE_DOM0

Also ok on netbsd-5 XEN3_DOM0 amd64 (disk at disk speed, low sys, right
number of disk interrupts, mostly idle).

Both systems have normal mobo disk controllers.  The second is:

acpi0: X/RSDT: OemId , AslId <,0113>
piixide0 at pci0 dev 31 function 2
piixide0: Intel 82801I Serial ATA Controller (ICH9) (rev. 0x02)
piixide0: bus-master DMA support present
piixide0: primary channel configured to native-PCI mode
piixide0: using ioapic0 pin 21, event channel 5 for native-PCI interrupt
atabus2 at piixide0 channel 0
piixide0: secondary channel configured to native-PCI mode
atabus3 at piixide0 channel 1
piixide1 at pci0 dev 31 function 5
piixide1: Intel 82801I Serial ATA Controller (ICH9) (rev. 0x02)
piixide1: bus-master DMA support present
piixide1: primary channel wired to native-PCI mode
piixide1: using ioapic0 pin 21, event channel 5 for native-PCI interrupt
atabus4 at piixide1 channel 0
piixide1: secondary channel wired to native-PCI mode
atabus5 at piixide1 channel 1
wd0 at atabus2 drive 0: 
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 698 GB, 1453521 cyl, 16 head, 63 sec, 512 bytes/sect x 1465149168 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)

I wonder if this is a netbsd-7 vs netbsd-6 thing.  Or only triggered by
some interrupt mappings.





Re: Interrupt time inflation on Xen

2016-03-22 Thread Thor Lancelot Simon
On Tue, Mar 22, 2016 at 09:14:33AM +0100, Manuel Bouyer wrote:
> On Mon, Mar 21, 2016 at 07:10:51PM -0400, Thor Lancelot Simon wrote:
> 
> > What's particularly curious here to me is that we are seeing
> > twice as many interrupts as we should, in the poorly performing Xen
> > case.
> 
> What controller is it ? maybe when running faster it's able
> to coalesce some interrupts ?
> The irq rate seems to be mostly the same in both cases, but with the
> number of I/O ops cut by half in the Xen case.

The controller is a ciss.  I observed the same problems with an arcmsr.

I don't think this has anything to do with the controller.  At 1/2 the
data and request rate, under Xen we use 60% CPU as interrupt time
while with a native kernel, it's 1-3%.

The same test's easy to reproduce with any Xen dom0, since all you have
to do is dd from the raw disk into /dev/null.  Do you see >40% interrupt
time and all your idle CPU go away?  That's the issue.

Thor


Re: Interrupt time inflation on Xen

2016-03-22 Thread Manuel Bouyer
On Mon, Mar 21, 2016 at 07:10:51PM -0400, Thor Lancelot Simon wrote:
> On Sun, Mar 20, 2016 at 08:19:23PM -0400, Thor Lancelot Simon wrote:
> > 
> > I've attached the dmesg from the GENERIC kernel and the boot-time
> > messages (giving both the xen dmesg and NetBSD dmesg) from the XEN3_DOM0
> > kernel.  Xen kernel is 4.5 from pkgsrc; it is NOT a debug kernel.
> > 
> > Interrupt routing seems to be the same, disk controller is on ioapic1 pin 6
> > in both cases.
> 
> However, the Xen kernel emits:
> 
> ioapic1 at mainbus0 apid 7
> ioapic1: can't remap to apid 7
> 
> Harmless?

I think it is, as it's trying to remap to the same value (AFAIK).
But I don't know the x86 interrupt hardware that well.


> What's particularly curious here to me is that we are seeing
> twice as many interrupts as we should, in the poorly performing Xen
> case.

What controller is it ? maybe when running faster it's able
to coalesce some interrupts ?
The irq rate seems to be mostly the same in both cases, but with the
number of I/O ops cut by half in the Xen case.

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


Re: Interrupt time inflation on Xen

2016-03-21 Thread Thor Lancelot Simon
On Sun, Mar 20, 2016 at 08:19:23PM -0400, Thor Lancelot Simon wrote:
> 
> I've attached the dmesg from the GENERIC kernel and the boot-time
> messages (giving both the xen dmesg and NetBSD dmesg) from the XEN3_DOM0
> kernel.  Xen kernel is 4.5 from pkgsrc; it is NOT a debug kernel.
> 
> Interrupt routing seems to be the same, disk controller is on ioapic1 pin 6
> in both cases.

However, the Xen kernel emits:

ioapic1 at mainbus0 apid 7
ioapic1: can't remap to apid 7

Harmless?  What's particularly curious here to me is that we are seeing
twice as many interrupts as we should, in the poorly performing Xen
case.

Thor


Re: Interrupt time inflation on Xen

2016-03-20 Thread Thor Lancelot Simon
Same machine, new disk controller, new "disks" (SSDs), made it a
pure disk I/O test.  Same results:

dd if=/dev/rsd0d of=/dev/null bs=1m

With a netbsd-7 branch GENERIC kernel, I get:

* 640MB/sec
* about 2% interrupt time (system mostly idle)
* about 16,500 interrupts/sec,
  of which 10,000 are TLB shootdown IPIs and 6,500 are from the
  disk controller, which unsurprisingly is doing 6,500 IOPS.

With a netbsd-7 branch XEN3_DOM0 kernel, I get:

* 300MB/sec
* about 60% interrupt time (system noticeably laggy)
* about 6,000 interrupts/sec, all of which seem to be
  "from" the disk controller, which is doing only about 3,000 IOPS.

I've attached the dmesg from the GENERIC kernel and the boot-time
messages (giving both the xen dmesg and NetBSD dmesg) from the XEN3_DOM0
kernel.  Xen kernel is 4.5 from pkgsrc; it is NOT a debug kernel.

Interrupt routing seems to be the same, disk controller is on ioapic1 pin 6
in both cases.

Thor
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 7.0 (GENERIC.201509250726Z)
total memory = 73718 MB
avail memory = 71562 MB
kern.module.path=/stand/amd64/7.0/modules
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell   CS24-TY(A00   )
mainbus0 (root)
ACPI: RSDP 0xfadd0 24 (v02 ACPIAM)
ACPI: XSDT 0xbf6f0100 8C (v01 110410 XSDT1337 20101104 MSFT 0097)
ACPI: FACP 0xbf6f0290 F4 (v03 110410 FACP1337 20101104 MSFT 0097)
ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/64 (20131218/tbfadt-634)
ACPI: DSDT 0xbf6f05d0 005DDD (v01  DS993 DS993B16 0B16 INTL 20051117)
ACPI: FACS 0xbf6fe000 40
ACPI: APIC 0xbf6f0390 0001AE (v01 110410 APIC1337 20101104 MSFT 0097)
ACPI: SPCR 0xbf6f0540 50 (v01 110410 SPCR1337 20101104 MSFT 0097)
ACPI: MCFG 0xbf6f0590 3C (v01 110410 OEMMCFG  20101104 MSFT 0097)
ACPI: SSDT 0xbf6fa5d0 2A (v01 OEM_ID OEMTBLID 0001 INTL 20051117)
ACPI: OEMB 0xbf6fe040 83 (v01 110410 OEMB1337 20101104 MSFT 0097)
ACPI: HPET 0xbf6fa600 38 (v01 110410 OEMHPET  20101104 MSFT 0097)
ACPI: DMAR 0xbf6fe0d0 000128 (v01AMI  OEMDMAR 0001 MSFT 0097)
ACPI: SSDT 0xbf7033c0 000363 (v01 DpgPmmCpuPm 0012 INTL 20051117)
ACPI: EINJ 0xbf6fa640 000130 (v01  AMIER AMI_EINJ 20101104 MSFT 0097)
ACPI: BERT 0xbf6fa7d0 30 (v01  AMIER AMI_BERT 20101104 MSFT 0097)
ACPI: ERST 0xbf6fa800 0001B0 (v01  AMIER AMI_ERST 20101104 MSFT 0097)
ACPI: HEST 0xbf6fa9b0 A8 (v01  AMIER ABC_HEST 20101104 MSFT 0097)
ACPI: All ACPI Tables successfully acquired
ioapic0 at mainbus0 apid 6: pa 0xfec0, version 0x20, 24 pins
ioapic1 at mainbus0 apid 7: pa 0xfec8a000, version 0x20, 24 pins
cpu0 at mainbus0 apid 0: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu1 at mainbus0 apid 2: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu2 at mainbus0 apid 4: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu3 at mainbus0 apid 16: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu4 at mainbus0 apid 18: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu5 at mainbus0 apid 20: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu6 at mainbus0 apid 32: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu7 at mainbus0 apid 34: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu8 at mainbus0 apid 36: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu9 at mainbus0 apid 48: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu10 at mainbus0 apid 50: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu11 at mainbus0 apid 52: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu12 at mainbus0 apid 1: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu13 at mainbus0 apid 3: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu14 at mainbus0 apid 5: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu15 at mainbus0 apid 17: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu16 at mainbus0 apid 19: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu17 at mainbus0 apid 21: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu18 at mainbus0 apid 33: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu19 at mainbus0 apid 35: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu20 at mainbus0 apid 37: Intel(R) Xeon(R) CPU   L5639  @ 2.13GHz, id 0x206c2
cpu21 at mainbus0 apid 49: Intel(R) Xeon(R) CPU   L5639 

Re: Interrupt time inflation on Xen

2016-03-14 Thread Jean-Yves Migeon

On 2016-03-14 13:40, Thor Lancelot Simon wrote:
> On Mon, Mar 14, 2016 at 12:10:29PM +0100, Jean-Yves Migeon wrote:
> >
> > And same question applies if you reduce bs (128m from 1g for example)
>
> I'll run the other test today or tomorrow -- this one's a "no" -- the
> problem is just as bad with bs=64k

Alright.

> (did I say I was using a 1g blocksize
> before?  I wasn't, I was using a 1m blocksize).

Fair point :o When you get used to 1K blocks output...

--
Jean-Yves Migeon


Re: Interrupt time inflation on Xen

2016-03-14 Thread Thor Lancelot Simon
On Mon, Mar 14, 2016 at 12:10:29PM +0100, Jean-Yves Migeon wrote:
> 
> And same question applies if you reduce bs (128m from 1g for example)

I'll run the other test today or tomorrow -- this one's a "no" -- the
problem is just as bad with bs=64k (did I say I was using a 1g blocksize
before?  I wasn't, I was using a 1m blocksize).

-- 
  Thor Lancelot Simon                t...@panix.com

  "We cannot usually in social life pursue a single value or a single moral
   aim, untroubled by the need to compromise with others."  - H.L.A. Hart


Re: Interrupt time inflation on Xen

2016-03-14 Thread Johnny Billquist

On 2016-03-14 00:08, Thor Lancelot Simon wrote:

> I had occasion a couple of days ago to try to block-copy a very large
> filesystem from a xen dom0 to another machine across a fast local network.
>
> I tried this:
>
> sysctl -w kern.sbmax=1000
> sysctl -w net.inet.tcp.sendbuf_auto=0
> dd if=/dev/rsd0g bs=1048576 | ttcp -t -b 2097152 -l 131072 -fm -p 9001 172.0.0.1
>
> On the far end, I had simply the corresponding ttcp -r job (with corresponding
> socket/tcp sysctls in place) redirected to a file.
>
> The underlying hardware is a RAID5 of 300GB SAS drives on an Areca 1680.
> The Ethernet controller in use is a 'wm'.  All supported offload functionality
> is turned on (turning it off just slows things down).
>
> Shockingly, I saw the system go to 0% idle time, with 45-55% "Interrupt"
> and the rest "System".  Interrupts per second were a comparatively low 1500,
> about 500 disk and 1000 network.  Throughput was horrible -- about 35MB/sec.


This might not be related, or useful, but it does sound similar to what
I've been seeing on VAXen for a few years now. I get unexpectedly high
interrupt rates, but most significantly, I can get ridiculous system
time when doing something like a "cvs update". I have never had the time
to try and figure it out, but I normally have over 50% (usually around
70%) in system when doing this. Which does not seem reasonable.


But once more, this might be a red herring, so feel free to ignore.

Johnny



Re: Interrupt time inflation on Xen

2016-03-14 Thread Jean-Yves Migeon

On 2016-03-14 00:08, Thor Lancelot Simon wrote:
> [snip]
> I am going to install a different disk controller and see what happens,
> but I am not totally optimistic.  Does anyone have any idea what might be
> going on?

Nope, but I would be interested in the throughput you get when you pipe
dd if=/dev/rsd0g to /dev/null and /dev/zero to ttcp (without gzip),
simultaneously. Do you see any difference in the disk throughput when
stopping ttcp?


And same question applies if you reduce bs (128m from 1g for example)

--
Jean-Yves Migeon


Re: Interrupt time inflation on Xen

2016-03-13 Thread Thor Lancelot Simon
On Mon, Mar 14, 2016 at 02:53:54AM +0100, Pierre Pronchery wrote:
> On 03/14/16 00:08, Thor Lancelot Simon wrote:
> > I had occasion a couple of days ago to try to block-copy a very large
> > filesystem from a xen dom0 to another machine across a fast local network.
> > [...]
> > 
> > I am going to install a different disk controller and see what happens,
> > but I am not totally optimistic.  Does anyone have any idea what might be
> > going on?
> 
> Try this pull-up:
> https://mail-index.netbsd.org/source-changes/2015/11/16/msg070367.html

Looks unlikely -- why would this impact what I am doing strictly on the
dom0, with no guests running?  Dom0 does not access its disks via
xbdback.

Thor


Interrupt time inflation on Xen

2016-03-13 Thread Thor Lancelot Simon
I had occasion a couple of days ago to try to block-copy a very large
filesystem from a xen dom0 to another machine across a fast local network.

I tried this:

sysctl -w kern.sbmax=1000
sysctl -w net.inet.tcp.sendbuf_auto=0
dd if=/dev/rsd0g bs=1048576 | ttcp -t -b 2097152 -l 131072 -fm -p 9001 172.0.0.1

On the far end, I had simply the corresponding ttcp -r job (with corresponding
socket/tcp sysctls in place) redirected to a file.

The underlying hardware is a RAID5 of 300GB SAS drives on an Areca 1680.
The Ethernet controller in use is a 'wm'.  All supported offload functionality
is turned on (turning it off just slows things down).

Shockingly, I saw the system go to 0% idle time, with 45-55% "Interrupt"
and the rest "System".  Interrupts per second were a comparatively low 1500,
about 500 disk and 1000 network.  Throughput was horrible -- about 35MB/sec.

I rebooted the system to a GENERIC kernel; throughput was 100MB/sec and
peaked at 200MB/sec when I piped the data through gzip -1.  The system
was mostly idle with under 2% "Interrupt" time.

Interrupts/sec were significantly higher due to both the higher data rate
and about 800 TLB shootdown interrupts/sec (the GENERIC kernel runs MP
on this system).

Kernels are both netbsd-7 branch from just around the time of the release,
XEN3_DOM0 and GENERIC.  Neither kernel has DEBUG nor DIAGNOSTIC nor KMEMSTATS.

Do we have some kind of terrible problem that causes a 20% increase in
interrupt time when running under Xen?  Note the Areca driver (arcmsr) is
a SCSI driver so it is giant-locked, but the XEN dom0 is UP, so this is
probably a red herring.

I am going to install a different disk controller and see what happens,
but I am not totally optimistic.  Does anyone have any idea what might be
going on?

-- 
  Thor Lancelot Simon                t...@panix.com

  "We cannot usually in social life pursue a single value or a single moral
   aim, untroubled by the need to compromise with others."  - H.L.A. Hart