Re: XEN3_DOMU no longer shutting down or rebooting

2019-02-19 Thread Cherry G . Mathew
Chavdar Ivanov  writes:

> Yes, it is. This happens on:
> ...
>  uname -a
> NetBSD nbuild.lorien.lan 8.99.34 NetBSD 8.99.34 (XEN3_DOMU) #0: Tue
> Feb 19 13:23:54 GMT 2019
> sysbu...@nbuild.lorien.lan:/home/sysbuild/amd64/obj/home/sysbuild/src/sys/arch/amd64/compile/XEN3_DOMU
> amd64
> 
>

Thank you - I will have a closer look over the weekend.
-- 
~cherry


Re: XEN3_DOMU no longer shutting down or rebooting

2019-02-19 Thread Cherry G. Mathew
On 19 February 2019 11:31:37 PM MYT, Martin Husemann  wrote:
>On Tue, Feb 19, 2019 at 03:28:23PM +, Chavdar Ivanov wrote:
>> Any bells ringing?
>
>I think this already has been fixed by Cherry?
>
>Martin

I believe so. 

Chavdar, I'd be interested to know if this is happening in -current.

Thanks

Cherry
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: current fails to boot on VirtualBox

2019-02-16 Thread Cherry G . Mathew
Chavdar Ivanov  writes:

> Mine is 8.99.34 from 09/02/2019, self build, works just fine. I'll
> build a new kernel now to compare.
>

Just FYI in case this is relevant and can help you with bisecting:

https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=53984


-- 
~cherry


Re: Automated report: NetBSD-current/i386 build failure

2018-12-25 Thread Cherry G . Mathew
NetBSD Test Fixture  writes:

> This is an automatically generated notice of a NetBSD-current/i386
> build failure.
>
> The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host,
> using sources from CVS date 2018.12.25.07.41.21.
>
> An extract from the build.sh output follows:
>
>pmap_kremove(istack, PAGE_SIZE);
>^~~~
> 
> /tmp/bracket/build/2018.12.25.07.41.21-i386/src/sys/arch/xen/x86/xen_intr.c:327:3:
>  error: implicit declaration of function 'pmap_update' 
> [-Werror=implicit-function-declaration]
>pmap_update(pmap_kernel());
>^~~
> 
> /tmp/bracket/build/2018.12.25.07.41.21-i386/src/sys/arch/xen/x86/xen_intr.c:327:15:
>  error: implicit declaration of function 'pmap_kernel' 
> [-Werror=implicit-function-declaration]
>pmap_update(pmap_kernel());
>^~~
> cc1: all warnings being treated as errors
> *** [xen_intr.o] Error code 1
> nbmake[2]: stopped in 
> /tmp/bracket/build/2018.12.25.07.41.21-i386/obj/sys/arch/i386/compile/INSTALL_XEN3PAE_DOMU
> --- kern-INSTALL ---
>
> The following commits were made between the last successful build and
> the failed build:
>
> 2018.12.25.06.50.11 cherry src/sys/arch/amd64/amd64/genassym.cf,v 1.71
> 2018.12.25.06.50.11 cherry src/sys/arch/amd64/amd64/lock_stubs.S,v 1.30
> 2018.12.25.06.50.11 cherry src/sys/arch/amd64/amd64/spl.S,v 1.37
> 2018.12.25.06.50.11 cherry src/sys/arch/amd64/amd64/vector.S,v 1.65
> 2018.12.25.06.50.11 cherry src/sys/arch/i386/i386/genassym.cf,v 1.108
> 2018.12.25.06.50.11 cherry src/sys/arch/i386/i386/spl.S,v 1.44
> 2018.12.25.06.50.11 cherry src/sys/arch/i386/i386/vector.S,v 1.79
> 2018.12.25.06.50.11 cherry src/sys/arch/x86/include/cpu.h,v 1.101
> 2018.12.25.06.50.12 cherry src/sys/arch/x86/isa/isa_machdep.c,v 1.43
> 2018.12.25.06.50.12 cherry src/sys/arch/x86/x86/i8259.c,v 1.22
> 2018.12.25.06.50.12 cherry src/sys/arch/x86/x86/intr.c,v 1.141
> 2018.12.25.06.50.12 cherry src/sys/arch/xen/conf/files.xen,v 1.174
> 2018.12.25.06.50.12 cherry src/sys/arch/xen/include/intr.h,v 1.51
> 2018.12.25.06.50.12 cherry src/sys/arch/xen/x86/hypervisor_machdep.c,v 1.34
> 2018.12.25.06.50.12 cherry src/sys/arch/xen/x86/xen_intr.c,v 1.11
> 2018.12.25.06.50.12 cherry src/sys/arch/xen/xen/clock.c,v 1.76
> 2018.12.25.06.50.12 cherry src/sys/arch/xen/xen/evtchn.c,v 1.83
> 2018.12.25.06.50.12 cherry src/sys/arch/xen/xen/xenevt.c,v 1.53
> 2018.12.25.07.41.21 msaitoh src/sys/dev/pci/if_wmvar.h,v 1.42
>
> Log files can be found at:
>
> http://releng.NetBSD.org/b5reports/i386/commits-2018.12.html#2018.12.25.07.41.21

This should be fixed now via:
http://mail-index.netbsd.org/source-changes/2018/12/25/msg101698.html

Sorry for the breakage.

Not so Happy Christmas!

-- 
~cherry


Re: Panic on a -current from 13/12/2018

2018-12-21 Thread Cherry G. Mathew
On December 22, 2018 2:24:44 AM GMT+05:30, Chavdar Ivanov wrote:
...
>
>It is also interesting that when NetBSD is run under XenServer (XCP-NG
>actually) in PV mode, benchmarked against the same 8.99.28 version
>running on a physical machine, everything on a 1GB interface and
>switch, I get a fully saturated line (~ 933Mb/s). When the iperf3
>server is on the same XCP-NG guest and the client - a CentOS guest -
>the figures approach 2.3Gb/sec.
>

Do you have jumbo frames enabled on the CentOS VM?

Thanks,



Re: MSI/MSI-X implementation and interrupt handling on i386/amd64

2018-12-06 Thread Cherry G . Mathew
Hi Geoff,

Saitoh-san pointed me at this email. I've been looking at MSI briefly -
should have some work in place to sort out this situation. About your
specific situation:

Geoff Wing  writes:

> Hi,
> brief background:  on an amd64 VM (1 CPU on VMWare ESXi) I had a network
> interface (vmx) failing because it could not get an interrupt slot.  The
> vmx wants 3 interrupts per interface (tx/rx/link-state).  I had a few
> on an admin machine and one started failing when ahcisata was changed to
> use MSI (not ahcisata's fault, obviously).
>
> On i386/amd64 each CPU has a 32-bit bitmask for interrupts (1 bit per) - but
> 16 of the 32 are reserved for legacy IRQs (on the first CPU).  MSI-X allows
> for 2048 interrupts.  On a physical machine with many CPUs the MSI interrupts
> are farmed out across the different CPUs so would not be apparent to most.
> (and no problem for those 65+ core machines).
>
> For my personal use, I've hacked around by ignoring the reserved legacy IRQ
> region because it's not relevant to me in my VM with MSI/MSI-X.  Other
> people using single CPU VMs may start bumping into this issue so just
> making people aware.  I haven't looked into changing how interrupts are
> handled or if there would be significant performance penalty.
>

You could have a stopgap fix by just using a 64-bit mask and equivalent
supporting data structures instead of the 32-bit one. You'll probably
also need to look at the spl.S assembler primitives that access the
pending bitmask, and teach them how to operate on a 64-bit pending
mask.
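
To make the stopgap concrete, here is a minimal user-space sketch in C
of a pending mask widened from 32 to 64 bits. All names here
(cpu_intr_state, ipending, intr_set_pending, intr_take_highest,
NUM_SLOTS) are hypothetical and are not the kernel's actual fields or
functions; the real work is in the spl.S primitives, which this does
not attempt to reproduce, and it ignores atomicity entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: slots per CPU after widening from 32 to 64. */
#define NUM_SLOTS 64

/* Hypothetical per-CPU state: one pending bit per interrupt slot. */
struct cpu_intr_state {
	uint64_t ipending;
};

/* Mark a slot as pending. */
static void
intr_set_pending(struct cpu_intr_state *ci, unsigned slot)
{
	assert(slot < NUM_SLOTS);
	/* The 64-bit constant matters: a plain 1 << 40 would overflow int. */
	ci->ipending |= UINT64_C(1) << slot;
}

/* Index of the highest set bit, or -1 if v is zero (portable bit scan). */
static int
highest_bit(uint64_t v)
{
	int n = -1;

	while (v != 0) {
		v >>= 1;
		n++;
	}
	return n;
}

/* Take (and clear) the highest pending slot; -1 if nothing is pending. */
static int
intr_take_highest(struct cpu_intr_state *ci)
{
	int slot = highest_bit(ci->ipending);

	if (slot >= 0)
		ci->ipending &= ~(UINT64_C(1) << slot);
	return slot;
}
```

Posting slots 3, 40 and 17 and then calling intr_take_highest()
repeatedly yields 40, 17, 3 and then -1. Slot 40 is only representable
because the mask is now 64 bits wide.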

The right thing to do is to stop using a bit mask entirely, and to use
a more scalable data structure for this. This isn't trivial though -
it will be harder to keep the assembler code correct than with a
straightforward bus-locked bitscan/compare etc.
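
Purely as an illustration of what "more scalable" could look like, and
not a description of any actual or planned NetBSD code, the sketch
below keeps one FIFO of pending sources per priority level, so the
number of sources is no longer bounded by a word width. Every name in
it (intr_source, cpu_pending, pending_post, pending_take, NIPL) is
hypothetical. It also leaves out locking and interrupt masking, which
is exactly the part that is hard to get right compared with a single
locked bit operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of interrupt priority levels. */
#define NIPL 8

/* Hypothetical interrupt source, linked into a per-level queue. */
struct intr_source {
	int id;				/* any identifier, e.g. an MSI-X vector */
	int ipl;			/* priority level, 0 .. NIPL-1 */
	int queued;			/* nonzero while on a pending queue */
	struct intr_source *next;
};

/* Hypothetical per-CPU state: one FIFO of pending sources per level. */
struct cpu_pending {
	struct intr_source *head[NIPL];
	struct intr_source *tail[NIPL];
};

/* Queue a source as pending; posting it twice has no further effect. */
static void
pending_post(struct cpu_pending *cp, struct intr_source *is)
{
	assert(is->ipl >= 0 && is->ipl < NIPL);
	if (is->queued)
		return;
	is->queued = 1;
	is->next = NULL;
	if (cp->tail[is->ipl] != NULL)
		cp->tail[is->ipl]->next = is;
	else
		cp->head[is->ipl] = is;
	cp->tail[is->ipl] = is;
}

/*
 * Dequeue the oldest source at the highest level strictly above curipl,
 * or return NULL if everything pending is blocked or nothing is pending.
 */
static struct intr_source *
pending_take(struct cpu_pending *cp, int curipl)
{
	for (int ipl = NIPL - 1; ipl > curipl; ipl--) {
		struct intr_source *is = cp->head[ipl];

		if (is == NULL)
			continue;
		cp->head[ipl] = is->next;
		if (cp->head[ipl] == NULL)
			cp->tail[ipl] = NULL;
		is->queued = 0;
		is->next = NULL;
		return is;
	}
	return NULL;
}
```

The cost of taking the next source depends on the number of priority
levels rather than the number of sources, but each step touches several
words, which a lone bit scan on one pending word does not.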

In any case, I'll report back when I get around to this.

Many Thanks,
-- 
~cherry


Re: Panic with recent -current with interrupt setup

2018-10-09 Thread Cherry G. Mathew
Andreas Gustafsson  writes:

> Cherry G.Mathew wrote:
>> Thank you - I've checked in a temporary 'fix'. This will sort itself out
>> once the interrupt rework to merge with native is complete.
>
> A dom0 built from 2018.10.07.20.30.50 boots successfully.  Thank you!

That's great news! There's some really intrusive stuff coming next, so
things are really going to break :-)

I'll probably check them in around Wednesday midday UTC.

Cheers,
-- 
~cherry



Re: Panic with recent -current with interrupt setup

2018-10-09 Thread Cherry G. Mathew
"Cherry G. Mathew"  writes:

> Andreas Gustafsson  writes:
>
>> Brad Spencer wrote:
>>> Just wondering if anyone else has seen this, but I am getting panics on
>>> boot during probe with sources after 2018-09-23 [at some point, at least
>>> 2018-09-29 and 2018-10-01 panic, but 2018-09-23 doesn't].  This is with
>>> trying to use the stock XEN3_DOM0 kernel on a new system I am setting
>>> up.  The panics seem related to setting up interrupts or printing
>>> interrupt information in the intel wm(4) driver.  The system in question
>>> does not have a serial port on it in any form, but I can probably
>>> capture a screen shot of the panic.  The keyboard works and ddb seems
>>> usable.
>>
>> I ran an automated test of -current from CVS source date
>> 2018.10.01.17.50.08, and can confirm that it also panics
>> on my HP DL360 G7, under both Xen 4.8 and 4.11.  Logs at:
>>
>>   http://www.gson.org/netbsd/bugs/xen/results/2018-10-02/index.html
>
> Thanks Andreas, Brad. I'm aware of this problem and the fix (msaitoh@
> tried and confirmed it works), and will sort it out as soon as I'm
> confident it's the right approach.
>
> Sorry for the breakage.

I have checked in a couple of changes, which hopefully should fix the
problem. I look forward to user reports.

Many Thanks,
-- 
~cherry



Re: Panic with recent -current with interrupt setup

2018-10-07 Thread Cherry G . Mathew
Andreas Gustafsson  writes:

> Cherry G. Mathew wrote:
>> I have checked in a couple of changes, which hopefully should fix the
>> problem. I look forward to user reports.
>
> With sources from CVS date 2018.10.06.16.49.54 (that's up to
> and including your commit of xen/x86/pintr.c 1.6), I get:
>
> [   1.030] bnx0 at pci7 dev 0 function 0: Broadcom NetXtreme II BCM5709 1000Base-T
> [   1.030] bnx0: Ethernet address 98:4b:e1:67:68:98
> (XEN) irq.c:1943: dom0: pirq 10 or irq 30 already mapped (0,28)
> [ 1.030] panic: kernel diagnostic assertion "irq2port[irq] == 0"
> failed: file
> "/tmp/bracket/build/2018.10.06.16.49.54-amd64/src/sys/arch/x86/x86/ioapic.c",
> line 583
>
> The full log is at:
>
>   http://www.gson.org/netbsd/bugs/xen/results/2018-10-07/data-411-current-2018.10.06.16.49.54-amd64/clean.txt

Thank you - I've checked in a temporary 'fix'. This will sort itself out
once the interrupt rework to merge with native is complete.

I hope it works,

Many Thanks,
-- 
~cherry


Re: Panic with recent -current with interrupt setup

2018-10-04 Thread Cherry G. Mathew
Andreas Gustafsson  writes:

> Brad Spencer wrote:
>> Just wondering if anyone else has seen this, but I am getting panics on
>> boot during probe with sources after 2018-09-23 [at some point, at least
>> 2018-09-29 and 2018-10-01 panic, but 2018-09-23 doesn't].  This is with
>> trying to use the stock XEN3_DOM0 kernel on a new system I am setting
>> up.  The panics seem related to setting up interrupts or printing
>> interrupt information in the intel wm(4) driver.  The system in question
>> does not have a serial port on it in any form, but I can probably
>> capture a screen shot of the panic.  The keyboard works and ddb seems
>> usable.
>
> I ran an automated test of -current from CVS source date
> 2018.10.01.17.50.08, and can confirm that it also panics
> on my HP DL360 G7, under both Xen 4.8 and 4.11.  Logs at:
>
>   http://www.gson.org/netbsd/bugs/xen/results/2018-10-02/index.html

Thanks Andreas, Brad. I'm aware of this problem and the fix (msaitoh@
tried and confirmed it works), and will sort it out as soon as I'm
confident it's the right approach.

Sorry for the breakage.
-- 
~cherry



Re: Automated report: NetBSD-current/i386 build failure

2018-09-23 Thread Cherry G . Mathew
Robert Elz  writes:

> Date:Sun, 23 Sep 2018 13:25:23 +0530
> From:"Cherry G.Mathew" 
> Message-ID:  <87efdkfy1g@zyx.in>
>
>   | Should be fixed now.
>  
> Not yet,...
>

Thanks for the alert. I have now tested the build; it should be fixed.

I would look forward to boot tests to make sure that nothing has changed
functionally (it shouldn't).

Thanks,
-- 
~cherry


Re: Automated report: NetBSD-current/i386 build failure

2018-09-23 Thread Cherry G . Mathew
Andreas Gustafsson  writes:

> Cherry,
>
> The NetBSD Test Fixture wrote:
>> --- assym.h ---
>> *** [assym.h] Error code 1
>> nbmake[2]: stopped in 
>> /tmp/bracket/build/2018.09.23.02.51.06-i386/obj/sys/arch/i386/compile/INSTALL_XEN3PAE_DOMU
>
> Here's a more relevant part of the build log:
>

[...]

> /tmp/bracket/build/2018.09.23.02.51.06-i386/obj/sys/arch/i386/compile/INSTALL_XEN3PAE_DOMU/xen-ma/machine/segments.h:201:32:
>  error: conflicting types for 'idt'
>  extern struct gate_descriptor *idt;

[...]

> /tmp/bracket/build/2018.09.23.02.51.06-i386/obj/sys/arch/i386/compile/INSTALL_XEN3PAE_DOMU/xen-ma/machine/segments.h:199:26:
>  note: previous declaration of 'idt' was here
>  extern idt_descriptor_t *idt;
>   ^~~

Sorry for the breakage - I didn't have a build environment for i386
handy.

Should be fixed now.

-- 
~cherry