Re: Getting the name of a kernel thread?

2021-04-15 Thread Martin Husemann
On Thu, Apr 15, 2021 at 01:39:23PM -0700, Brian Buhrow wrote:
> hello.  I just got two panics on a system running NetBSD-9.99.77/amd64.
> Unfortunately, I didn't have enough swap configured to capture a dump
> file.  However, I did figure out that the problem is in thread 6 of the
> kernel process, process 0.  However, I am having trouble figuring out
> what the name of thread 6 is.  Is there a keyword I can use with ps(1)
> to see the name of a particular thread?  I thought I might be able to do
> it with the wchan field, but for this particular thread, the wchan field
> is "-".  If I can figure out what thread 6 is on this system, it might
> give me a clue as to where the problem might be.  The problem is hard to
> reproduce, so any data I can get from the running system would be
> helpful.

I'd use crash to get the "live" ddb experience:

> crash
crash> ps
PID  LID  S  CPU  FLAGS  STRUCT LWP *  NAME       WAIT
[..]
0     10  3    0    200  106ea84c0     nfssilly   nfssilly
0      9  3    0    240  106ea8080     vdrain     vdrain
0      8  3    0    240  106e979c0     modunload  mod_unld
0      7  3    0    200  106e97580     xcall/0    xcall
0      6  1    0    200  106e97140     softser/0
0      5  1    0    200  106e96d00     softclk/0
0      4  1    0    200  106e968c0     softbio/0
0      3  1    0    200  106e96480     softnet/0
0      2  1    0    201  106e96040     idle/0
0      0  3    0    200  1c5e100       swapper    uvm

Martin
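
If the `ps` listing from crash is saved to a file, the lookup of a thread
name by PID and LID can be scripted.  The following is only a minimal Python
sketch under stated assumptions, not part of crash(8) or ddb: the names
`parse_ddb_ps`, `lwp_name` and `SAMPLE` are hypothetical, and the columns are
assumed to be whitespace-separated in the order PID, LID, S, CPU, FLAGS,
STRUCT LWP *, NAME, WAIT.  LWP numbering may differ between machines, so the
listing should come from the system that panicked.

```python
def parse_ddb_ps(text):
    """Parse whitespace-separated rows of a ddb/crash `ps` listing."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric PID and LID; this skips the
        # header line and elision markers such as "[..]".
        if len(fields) < 7 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        rows.append({
            "pid": int(fields[0]),
            "lid": int(fields[1]),
            "name": fields[6],
            # The WAIT column is empty for LWPs that are not sleeping.
            "wait": fields[7] if len(fields) > 7 else "",
        })
    return rows


def lwp_name(rows, pid, lid):
    """Return the NAME column for the given PID/LID, or None if absent."""
    for row in rows:
        if row["pid"] == pid and row["lid"] == lid:
            return row["name"]
    return None


# Three rows taken from the listing above (hypothetical variable name).
SAMPLE = """\
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
0 7 3 0 200 106e97580 xcall/0 xcall
0 6 1 0 200 106e97140 softser/0
0 5 1 0 200 106e96d00 softclk/0
"""

print(lwp_name(parse_ddb_ps(SAMPLE), 0, 6))
```

On the sample rows this prints `softser/0`, matching LID 6 in the listing.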


daily CVS update output

2021-04-15 Thread NetBSD source update


Updating src tree:
P src/distrib/sets/lists/base/mi
P src/distrib/sets/lists/tests/mi
P src/external/cddl/osnet/sys/sys/opentypes.h
P src/external/cddl/osnet/sys/sys/vnode.h
P src/external/gpl3/gcc/README.gcc10
U src/external/gpl3/gcc/README.warnings
P src/external/gpl3/gcc/dist/gcc/config/rs6000/rs6000.c
P src/sys/arch/alpha/include/cpu.h
P src/sys/arch/hp300/dev/diofb.c
P src/sys/arch/hp300/dev/topcat.c
P src/sys/arch/m68k/m68k/pmap_motorola.c
P src/sys/dev/pci/if_aq.c
P src/sys/modules/solaris/Makefile.solmod
P src/sys/modules/zfs/Makefile.zfsmod
P src/sys/net/if_pppoe.c
P src/sys/net/if_spppsubr.c
P src/sys/net/if_spppvar.h
P src/sys/rump/fs/lib/libzfs/Makefile
P src/sys/rump/kern/lib/libsolaris/Makefile
P src/usr.bin/make/job.c
P src/usr.bin/make/unit-tests/Makefile
U src/usr.bin/make/unit-tests/job-output-null.exp
U src/usr.bin/make/unit-tests/job-output-null.mk

Updating xsrc tree:


Killing core files:




Updating file list:
-rw-rw-r--  1 srcmastr  netbsd  38294689 Apr 16 03:03 ls-lRA.gz


Possible problem with com(4) at 115200 baud when 16550 has only 1 byte in its fifo?

2021-04-15 Thread Brian Buhrow
Hello.  It looks like there is a problem in the comsoft() routine in
sys/dev/ic/com.c.  When a panic occurred, I was using com0 on the machine
in question, and the port was sending and receiving data simultaneously at
115200 baud.  It's been a long time since I touched this com.c code, but it
looks to me like comsoft() doesn't use the mutex comintr() uses to ensure
exclusive access.  My question is: what happens if comintr() fires, does
its thing, launches comsoft() and, before comsoft() finishes, comintr()
fires again?  The serial port on this machine has a fifo of 1 byte, so
interrupts can come in pretty fast when it's receiving at 115200 baud.
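
The interleaving being asked about can be pictured with a toy model.  The
sketch below is Python and entirely hypothetical: `Port`, `hard_intr`,
`soft_intr_racy` and `soft_intr_safe` are invented names, it does not
reproduce com.c, and it makes no claim that com.c actually has this bug.  It
only shows how a soft handler that reads shared state and later resets it
can lose data if the hard handler runs in between, and how removing only the
consumed entries (standing in for what a lock around the read-and-reset
would guarantee) avoids that.

```python
class Port:
    """Toy model: `pending` stands in for state shared by both handlers."""

    def __init__(self):
        self.pending = []


def hard_intr(port, byte):
    # Hard interrupt: queue one received byte (the 1-byte FIFO case).
    port.pending.append(byte)


def soft_intr_racy(port, fires_midway=None):
    seen = list(port.pending)          # read shared state
    if fires_midway is not None:
        hard_intr(port, fires_midway)  # hard interrupt fires again here
    port.pending = []                  # reset based on the stale read
    return seen


def soft_intr_safe(port, fires_midway=None):
    seen = list(port.pending)
    if fires_midway is not None:
        hard_intr(port, fires_midway)
    del port.pending[:len(seen)]       # remove only what was consumed
    return seen


racy = Port()
hard_intr(racy, "a")
racy_out = soft_intr_racy(racy, fires_midway="b")

safe = Port()
hard_intr(safe, "a")
safe_out = soft_intr_safe(safe, fires_midway="b")

print(racy_out, racy.pending)  # byte "b" is gone from the racy port
print(safe_out, safe.pending)  # byte "b" is still queued on the safe port
```

In the racy variant the byte that arrived mid-way is silently dropped; a
real driver could just as well corrupt a pointer or counter in the same
window, which is the kind of failure that might end in a panic.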

On the machine in question, a NetBSD-9.99.77/amd64 device with a 1-byte
fifo 16550-compatible serial chip, I was able to reproduce this panic twice
within just a few minutes of each other.

-thanks
-Brian




Getting the name of a kernel thread?

2021-04-15 Thread Brian Buhrow
hello.  I just got two panics on a system running NetBSD-9.99.77/amd64.
Unfortunately, I didn't have enough swap configured to capture a dump
file.  However, I did figure out that the problem is in thread 6 of the
kernel process, process 0.  However, I am having trouble figuring out what
the name of thread 6 is.  Is there a keyword I can use with ps(1) to see
the name of a particular thread?  I thought I might be able to do it with
the wchan field, but for this particular thread, the wchan field is "-".
If I can figure out what thread 6 is on this system, it might give me a
clue as to where the problem might be.  The problem is hard to reproduce,
so any data I can get from the running system would be helpful.

-thanks
-Brian



Re: running xen on current

2021-04-15 Thread Greg A. Woods
At Thu, 15 Apr 2021 13:02:54 +0200, Manuel Bouyer wrote:
Subject: Re: running xen on current
>
> AFAIK EFI is not yet supported by Xen (maybe this is supported by 4.15,
> I've not had a chance to try yet). I have it running on fairly recent
> Dell servers (in BIOS mode)

My Dell servers, even the newer PE-R510, are much older I think  :-)

They run -current (2021-03-10) quite well (except for PR# 54969 -- I
have to remember to unmount my larger filesystems manually before any
reboot unless I want to risk loss and/or wait a long time for fscks -- I
haven't turned on '-o log' for them yet as I wanted to measure its
performance impact).

My XEN3_DOM0 kernel is somewhat customized, but not in any way that
should affect the hardware support or Xen -- of interest might be iscsi
support and VND_COMPRESSION, but I haven't tried testing either yet.

I did read about the unified EFI image support in Xen 4.15 and I was
thinking of trying it on my old MacBookPro -- but I would also want X11
to work on it too, and even FreeBSD's Xserver wasn't working on it last
summer, so I went back to MacOS in order to be able to use it for web
and such as well as remote access.

--
Greg A. Woods 

Kelowna, BC +1 250 762-7675   RoboHack 
Planix, Inc.  Avoncote Farms 




Re: running xen on current

2021-04-15 Thread Brian Buhrow
hello.  The difference between UEFI and legacy booting is significant.
I'm not sure about the current state of NetBSD and xen-dom0, but with
FreeBSD, legacy booting is required unless you're running 13-current.  I
think NetBSD/xen-dom0 supports UEFI booting, but it requires you to use
multiboot mode instead of the standard NetBSD boot mode.  In addition, I
think you need to be running a pretty recent -current, i.e. something since
January 1, 2021.
If you can boot your systems in legacy mode, however, NetBSD-9.x/Xen works
very well, except in conjunction with zfs.

Hope that helps.
-Brian

On Apr 15,  9:53am, Patrick Welche wrote:
} Subject: running xen on current
} I have tried and failed to run xen on 3 -current/amd64 systems with
} 3 different failure modes:
} 
} 1) laptop:  xen.gz Building a PV Dom0 / ELF: not an ELF binary -> panic/reboot
} 2) desktop: XEN3_DOM0 panics including PR port-xen/55978
} 3) server:  Trampoline space cannot be allocated; will try fallback -> reboot
} 
} They are all working NetBSD-current/amd64 systems.
} 
} My conclusion was that xen is hopelessly broken, so was quite surprised
} by Greg Woods' thread about the finer points of running a guest OS, given
} that those systems won't even start the host OS.
} 
} I dug out an old desktop, and to my pleasant surprise it booted XEN3_DOM0,
} and I have managed to run some XEN3_DOMUs.
} 
} The difference between the working/broken setups seems to be that the
} working one is "BIOS" booting rather than EFI booting.
} 
} Among all your xen success stories, are any of you EFI booting?
} 
} 
} Cheers,
} 
} Patrick
} 
} 
} =
} 
} Some extra gory details
} 
} 1) laptop:
} 
}  Building a PV Dom0 
} ELF: Not an ELF binary
} 
} ***
} Panic on CPU 0:
} Could not set up DOM0 guest OS
} ***
} 
} Reboot in five seconds...
} 
} 
} 2) desktop: selection of panics in addition to PR port-xen/55978
} 
} 
} [  80.989] panic: LIST_INSERT_HEAD 0xa080073eec28 ../../../../arch/x86/x86/pmap.c:2285
} [  80.989] cpu13: Begin traceback...
} [  80.989] vpanic() at netbsd:vpanic+0x14a
} [  80.989] snprintf() at netbsd:snprintf
} [  80.989] pmap_enter_ma() at netbsd:pmap_enter_ma+0x14e7
} [  80.989] pmap_enter() at netbsd:pmap_enter+0x32
} [  80.989] udv_fault() at netbsd:udv_fault+0x100
} [  80.989] uvm_fault_internal() at netbsd:uvm_fault_internal+0x574
} [  80.989] trap() at netbsd:trap+0x432
} [  80.989] --- trap (number 6) ---
} [  80.989] 7a60617787af:
} [  80.989] cpu13: End traceback...
} 
} [  75.6599981] panic: kernel diagnostic assertion "ncp->nc_dvp == dvp" failed: file "../../../../kern/vfs_cache.c", line 432
} [  75.6599981] cpu0: Begin traceback...
} [  75.6599981] vpanic() at netbsd:vpanic+0x14a
} [  75.6599981] kern_assert() at netbsd:kern_assert+0x48
} [  75.6599981] cache_lookup_entry() at netbsd:cache_lookup_entry+0xde
} [  75.6599981] cache_lookup_linked() at netbsd:cache_lookup_linked+0x160
} [  75.6599981] namei_tryemulroot() at netbsd:namei_tryemulroot+0x298
} [  75.6599981] namei() at netbsd:namei+0x29
} [  75.6599981] vn_open() at netbsd:vn_open+0x8f
} [  75.6599981] do_open() at netbsd:do_open+0x119
} [  75.6599981] do_sys_openat() at netbsd:do_sys_openat+0x74
} [  75.6599981] sys_open() at netbsd:sys_open+0x24
} [  75.6599981] syscall() at netbsd:syscall+0x9c
} [  75.6599981] --- syscall (number 5) ---
} [  75.6599981] netbsd:syscall+0x9c:
} [  75.6599981] cpu0: End traceback...
} 
} 
} 3) server: EFI boot of Feb 6 2021, xenkernel413-4.13.3.tgz, serial console
} 
} On serial console, all that is seen is:
} 
} 2415648+1324000=0x3910ec 
} Loading /var/db/entropy-file
} Loading /netbsd-XEN3_DOM0
} Start @ 0xce60 [1=0xce991000-0xce9910ec]... 
} Trampoline space cannot be allocated; will try fallback.
} 
} then it reboots
>-- End of excerpt from Patrick Welche




Re: running xen on current

2021-04-15 Thread Manuel Bouyer
On Thu, Apr 15, 2021 at 01:39:37PM +0100, Patrick Welche wrote:
> On Thu, Apr 15, 2021 at 07:28:32AM -0400, Brad Spencer wrote:
> > Manuel Bouyer  writes:
> > 
> > > On Thu, Apr 15, 2021 at 09:53:50AM +0100, Patrick Welche wrote:
> > >> I have tried and failed to run xen on 3 -current/amd64 systems with
> > >> 3 different failure modes:
> > >> 
> > >> 1) laptop:  xen.gz Building a PV Dom0 / ELF: not an ELF binary -> panic/reboot
> > >> 2) desktop: XEN3_DOM0 panics including PR port-xen/55978
> > >> 3) server:  Trampoline space cannot be allocated; will try fallback -> reboot
> > >> 
> > >> They are all working NetBSD-current/amd64 systems.
> > >> 
> > >> My conclusion was that xen is hopelessly broken, so was quite surprised
> > >> by Greg Woods' thread about the finer points of running a guest OS, given
> > >> that those systems won't even start the host OS.
> > >> 
> > >> I dug out an old desktop, and to my pleasant surprise it booted XEN3_DOM0,
> > >> and I have managed to run some XEN3_DOMUs.
> > >> 
> > >> The difference between the working/broken setups seems to be that the
> > >> working one is "BIOS" booting rather than EFI booting.
> > >> 
> > >> Among all your xen success stories, are any of you EFI booting?
> > >
> > > AFAIK EFI is not yet supported by Xen (maybe this is supported by 4.15,
> > > I've not had a chance to try yet). I have it running on fairly recent
> > > Dell servers (in BIOS mode)
> > 
> > 
> > There has been fiddling with Xen and EFI for quite some time.  See:
> > 
> > https://wiki.xenproject.org/wiki/Xen_EFI
> > 
> > for example (might be old)... this indicates that Xen 4.3 or later could
> > be built as a EFI binary and probably booted from the EFI firmware
> > directly or with grub2 when grub2 is a EFI binary itself.  Of course
> > those instructions are all Linux-centric and I don't know if you created
> > a Xen kernel like this if it would boot a NetBSD DOM0 kernel.  I am in
> > no position to try any tests with this right now personally, but it is
> > tempting as I have a EFI only laptop that I could probably replace the
> > hard drive temporarily.
> 
> Looking at
> 
>   https://xenproject.org/2021/04/08/xen-project-hypervisor-4-15/
> 
> (so 4.15 only just came out!) I see
> 
>   Unified boot images: It is now possible to create an image bundling
>   together files needed for Xen to boot into a single EFI binary;
>   making it now possible to boot a functional Xen system directly
>   from the EFI boot manager, rather than having to go through grub
>   multiboot.  Files that can be bundled include a hypervisor, dom0
>   kernel, dom0 initrd, Xen KConfig, XSM configuration, and a device
>   tree.
> 
> I thought that "go through grub multiboot" was the equivalent of our
> boot.cfg "multiboot /xen.gz dom0_mem=1024M", but apparently not?

It should be; but there are probably differences between BIOS and EFI, even
when using multiboot (the way to access the console, or find the ACPI
tables, may be different, for example)

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


Re: running xen on current

2021-04-15 Thread Patrick Welche
On Thu, Apr 15, 2021 at 07:28:32AM -0400, Brad Spencer wrote:
> Manuel Bouyer  writes:
> 
> > On Thu, Apr 15, 2021 at 09:53:50AM +0100, Patrick Welche wrote:
> >> I have tried and failed to run xen on 3 -current/amd64 systems with
> >> 3 different failure modes:
> >> 
> >> 1) laptop:  xen.gz Building a PV Dom0 / ELF: not an ELF binary -> panic/reboot
> >> 2) desktop: XEN3_DOM0 panics including PR port-xen/55978
> >> 3) server:  Trampoline space cannot be allocated; will try fallback -> reboot
> >> 
> >> They are all working NetBSD-current/amd64 systems.
> >> 
> >> My conclusion was that xen is hopelessly broken, so was quite surprised
> >> by Greg Woods' thread about the finer points of running a guest OS, given
> >> that those systems won't even start the host OS.
> >> 
> >> I dug out an old desktop, and to my pleasant surprise it booted XEN3_DOM0,
> >> and I have managed to run some XEN3_DOMUs.
> >> 
> >> The difference between the working/broken setups seems to be that the
> >> working one is "BIOS" booting rather than EFI booting.
> >> 
> >> Among all your xen success stories, are any of you EFI booting?
> >
> > AFAIK EFI is not yet supported by Xen (maybe this is supported by 4.15,
> > I've not had a chance to try yet). I have it running on fairly recent
> > Dell servers (in BIOS mode)
> 
> 
> There has been fiddling with Xen and EFI for quite some time.  See:
> 
> https://wiki.xenproject.org/wiki/Xen_EFI
> 
> for example (might be old)... this indicates that Xen 4.3 or later could
> be built as a EFI binary and probably booted from the EFI firmware
> directly or with grub2 when grub2 is a EFI binary itself.  Of course
> those instructions are all Linux-centric and I don't know if you created
> a Xen kernel like this if it would boot a NetBSD DOM0 kernel.  I am in
> no position to try any tests with this right now personally, but it is
> tempting as I have a EFI only laptop that I could probably replace the
> hard drive temporarily.

Looking at

  https://xenproject.org/2021/04/08/xen-project-hypervisor-4-15/

(so 4.15 only just came out!) I see

  Unified boot images: It is now possible to create an image bundling
  together files needed for Xen to boot into a single EFI binary;
  making it now possible to boot a functional Xen system directly
  from the EFI boot manager, rather than having to go through grub
  multiboot.  Files that can be bundled include a hypervisor, dom0
  kernel, dom0 initrd, Xen KConfig, XSM configuration, and a device
  tree.

I thought that "go through grub multiboot" was the equivalent of our
boot.cfg "multiboot /xen.gz dom0_mem=1024M", but apparently not?
(Seems different to booting straight from the EFI boot menu)


Cheers,

Patrick


Re: running xen on current

2021-04-15 Thread Brad Spencer
Manuel Bouyer  writes:

> On Thu, Apr 15, 2021 at 09:53:50AM +0100, Patrick Welche wrote:
>> I have tried and failed to run xen on 3 -current/amd64 systems with
>> 3 different failure modes:
>> 
>> 1) laptop:  xen.gz Building a PV Dom0 / ELF: not an ELF binary -> panic/reboot
>> 2) desktop: XEN3_DOM0 panics including PR port-xen/55978
>> 3) server:  Trampoline space cannot be allocated; will try fallback -> reboot
>> 
>> They are all working NetBSD-current/amd64 systems.
>> 
>> My conclusion was that xen is hopelessly broken, so was quite surprised
>> by Greg Woods' thread about the finer points of running a guest OS, given
>> that those systems won't even start the host OS.
>> 
>> I dug out an old desktop, and to my pleasant surprise it booted XEN3_DOM0,
>> and I have managed to run some XEN3_DOMUs.
>> 
>> The difference between the working/broken setups seems to be that the
>> working one is "BIOS" booting rather than EFI booting.
>> 
>> Among all your xen success stories, are any of you EFI booting?
>
> AFAIK EFI is not yet supported by Xen (maybe this is supported by 4.15,
> I've not had a chance to try yet). I have it running on fairly recent
> Dell servers (in BIOS mode)


There has been fiddling with Xen and EFI for quite some time.  See:

https://wiki.xenproject.org/wiki/Xen_EFI

for example (might be old)... this indicates that Xen 4.3 or later could
be built as a EFI binary and probably booted from the EFI firmware
directly or with grub2 when grub2 is a EFI binary itself.  Of course
those instructions are all Linux-centric and I don't know if you created
a Xen kernel like this if it would boot a NetBSD DOM0 kernel.  I am in
no position to try any tests with this right now personally, but it is
tempting as I have a EFI only laptop that I could probably replace the
hard drive temporarily.



-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org


Re: running xen on current

2021-04-15 Thread Manuel Bouyer
On Thu, Apr 15, 2021 at 09:53:50AM +0100, Patrick Welche wrote:
> I have tried and failed to run xen on 3 -current/amd64 systems with
> 3 different failure modes:
> 
> 1) laptop:  xen.gz Building a PV Dom0 / ELF: not an ELF binary -> panic/reboot
> 2) desktop: XEN3_DOM0 panics including PR port-xen/55978
> 3) server:  Trampoline space cannot be allocated; will try fallback -> reboot
> 
> They are all working NetBSD-current/amd64 systems.
> 
> My conclusion was that xen is hopelessly broken, so was quite surprised
> by Greg Woods' thread about the finer points of running a guest OS, given
> that those systems won't even start the host OS.
> 
> I dug out an old desktop, and to my pleasant surprise it booted XEN3_DOM0,
> and I have managed to run some XEN3_DOMUs.
> 
> The difference between the working/broken setups seems to be that the
> working one is "BIOS" booting rather than EFI booting.
> 
> Among all your xen success stories, are any of you EFI booting?

AFAIK EFI is not yet supported by Xen (maybe this is supported by 4.15,
I've not had a chance to try yet). I have it running on fairly recent
Dell servers (in BIOS mode)

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


running xen on current

2021-04-15 Thread Patrick Welche
I have tried and failed to run xen on 3 -current/amd64 systems with
3 different failure modes:

1) laptop:  xen.gz Building a PV Dom0 / ELF: not an ELF binary -> panic/reboot
2) desktop: XEN3_DOM0 panics including PR port-xen/55978
3) server:  Trampoline space cannot be allocated; will try fallback -> reboot

They are all working NetBSD-current/amd64 systems.

My conclusion was that xen is hopelessly broken, so was quite surprised
by Greg Woods' thread about the finer points of running a guest OS, given
that those systems won't even start the host OS.

I dug out an old desktop, and to my pleasant surprise it booted XEN3_DOM0,
and I have managed to run some XEN3_DOMUs.

The difference between the working/broken setups seems to be that the
working one is "BIOS" booting rather than EFI booting.

Among all your xen success stories, are any of you EFI booting?


Cheers,

Patrick


=

Some extra gory details

1) laptop:

 Building a PV Dom0 
ELF: Not an ELF binary

***
Panic on CPU 0:
Could not set up DOM0 guest OS
***

Reboot in five seconds...


2) desktop: selection of panics in addition to PR port-xen/55978


[  80.989] panic: LIST_INSERT_HEAD 0xa080073eec28 ../../../../arch/x86/x86/pmap.c:2285
[  80.989] cpu13: Begin traceback...
[  80.989] vpanic() at netbsd:vpanic+0x14a
[  80.989] snprintf() at netbsd:snprintf
[  80.989] pmap_enter_ma() at netbsd:pmap_enter_ma+0x14e7
[  80.989] pmap_enter() at netbsd:pmap_enter+0x32
[  80.989] udv_fault() at netbsd:udv_fault+0x100
[  80.989] uvm_fault_internal() at netbsd:uvm_fault_internal+0x574
[  80.989] trap() at netbsd:trap+0x432
[  80.989] --- trap (number 6) ---
[  80.989] 7a60617787af:
[  80.989] cpu13: End traceback...

[  75.6599981] panic: kernel diagnostic assertion "ncp->nc_dvp == dvp" failed: file "../../../../kern/vfs_cache.c", line 432
[  75.6599981] cpu0: Begin traceback...
[  75.6599981] vpanic() at netbsd:vpanic+0x14a
[  75.6599981] kern_assert() at netbsd:kern_assert+0x48
[  75.6599981] cache_lookup_entry() at netbsd:cache_lookup_entry+0xde
[  75.6599981] cache_lookup_linked() at netbsd:cache_lookup_linked+0x160
[  75.6599981] namei_tryemulroot() at netbsd:namei_tryemulroot+0x298
[  75.6599981] namei() at netbsd:namei+0x29
[  75.6599981] vn_open() at netbsd:vn_open+0x8f
[  75.6599981] do_open() at netbsd:do_open+0x119
[  75.6599981] do_sys_openat() at netbsd:do_sys_openat+0x74
[  75.6599981] sys_open() at netbsd:sys_open+0x24
[  75.6599981] syscall() at netbsd:syscall+0x9c
[  75.6599981] --- syscall (number 5) ---
[  75.6599981] netbsd:syscall+0x9c:
[  75.6599981] cpu0: End traceback...
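
When comparing several panics like the two above, it can help to reduce each
traceback to its call chain.  The following is a minimal Python sketch, not
an existing tool: `FRAME`, `frames` and `LOG` are hypothetical names, and
the regular expression assumes frame lines of the form
"func() at netbsd:symbol+0xoffset" as they appear in these tracebacks.

```python
import re

# Matches e.g. "[  80.989] vpanic() at netbsd:vpanic+0x14a"; the offset
# part is optional because some frames are printed without one.
FRAME = re.compile(r"\]\s+(\w+)\(\) at netbsd:(\w+)(?:\+0x([0-9a-f]+))?")


def frames(log):
    """Return (symbol, offset) pairs for each stack frame line in `log`."""
    chain = []
    for line in log.splitlines():
        match = FRAME.search(line)
        if match:
            offset = int(match.group(3), 16) if match.group(3) else 0
            chain.append((match.group(2), offset))
    return chain


# A few lines copied from the first traceback (hypothetical variable name).
LOG = """\
[  80.989] cpu13: Begin traceback...
[  80.989] vpanic() at netbsd:vpanic+0x14a
[  80.989] snprintf() at netbsd:snprintf
[  80.989] pmap_enter_ma() at netbsd:pmap_enter_ma+0x14e7
[  80.989] --- trap (number 6) ---
"""

for symbol, offset in frames(LOG):
    print(f"{symbol}+{offset:#x}")
```

Lines that are not frames, such as the "Begin traceback" and "--- trap"
markers, are skipped.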


3) server: EFI boot of Feb 6 2021, xenkernel413-4.13.3.tgz, serial console

On serial console, all that is seen is:

2415648+1324000=0x3910ec 
Loading /var/db/entropy-file
Loading /netbsd-XEN3_DOM0
Start @ 0xce60 [1=0xce991000-0xce9910ec]... 
Trampoline space cannot be allocated; will try fallback.

then it reboots