Re: Panic captures of VM

2019-10-16 Thread Mischa Peters



> On 16 Oct 2019, at 21:35, Mike Larkin  wrote:
> 
> On Wed, Oct 16, 2019 at 06:14:55PM +0200, Mischa wrote:
>> Hi Stuart,
>> 
>> 
>>> On 16 Oct 2019, at 18:07, Stuart Henderson wrote:
>>> 
>>> On 2019/10/16 18:00, Mischa wrote:
>>>> Hi All,
>>>> 
>>>> One of the OpenBSD VMs running on 6.6-beta #313 is rebooting or panicking
>>>> in different ways.
>>>> Not sure if they are all relevant or useful, but here are the ones we
>>>> managed to capture.
>>> 
>>> There's not a lot of information in your mail... for starters, what are
>>> you running the VM in, and is there any difference in the config for that
>>> VM compared to other working ones?
>> 
>> Fair point.
>> 
>> There are 10 VMs running on this host, the host is running:
>> $ sysctl kern.version
>> kern.version=OpenBSD 6.6-beta (GENERIC.MP) #313: Tue Sep 10 23:30:52 MDT 2019
>>     dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>> 
>> I know of one other VM which reboots every once in a while, but I haven’t 
>> seen any panics there.
>> As for the other 8, every once in a while I see a VM shut down, but I have 
>> no capture of the console.
>> 
>>> Do you have any other VMs running the same OpenBSD snapshot successfully?
>> 
>> The rest of the VMs are on -stable as far as I am aware. Other people are 
>> operating these VMs.
>> 
>>> Can you boot an old kernel and get a dmesg?
>> 
>> Here is a dmesg which we managed to capture after one of the panics:
>> 
> 
> Are you in swap at all on that host?

Yes. :/

load averages:  0.02,  0.06,  0.13    server1.openbsd.amsterdam 21:41:11
71 processes: 70 idle, 1 on processor                  up 34 days,  2:15
CPU0:  0.7% user,  0.0% nice,  2.6% sys,  0.5% spin,  0.1% intr, 96.2% idle
CPU1:  0.7% user,  0.0% nice,  3.4% sys,  0.4% spin,  0.0% intr, 95.6% idle
CPU2:  6.2% user,  0.0% nice, 31.0% sys, 11.0% spin,  0.0% intr, 51.8% idle
CPU3:  0.7% user,  0.0% nice,  2.8% sys,  0.3% spin,  0.0% intr, 96.1% idle
Memory: Real: 5427M/7623M act/tot Free: 275M Cache: 2001M Swap: 41M/8405M
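
For reference, the swap figures in the Memory line above can be pulled out and turned into a percentage. This is only a minimal Python sketch: the helper names `swap_usage` and `_to_bytes` are made up for illustration, and it assumes the sizes always carry a K, M, or G suffix as they do in this capture.

```python
import re
from typing import Tuple

# Multipliers for the size suffixes seen in the capture (assumed: K, M, G only).
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def _to_bytes(value: str) -> int:
    """Convert a string such as '41M' to a byte count."""
    return int(value[:-1]) * _UNITS[value[-1]]


def swap_usage(memory_line: str) -> Tuple[int, int, float]:
    """Return (used_bytes, total_bytes, percent_used) from a top Memory line."""
    match = re.search(r"Swap:\s*(\d+[KMG])/(\d+[KMG])", memory_line)
    if match is None:
        raise ValueError("no Swap: field found in line")
    used, total = (_to_bytes(v) for v in match.groups())
    percent = 100.0 * used / total if total else 0.0
    return used, total, percent


line = "Memory: Real: 5427M/7623M act/tot Free: 275M Cache: 2001M Swap: 41M/8405M"
used, total, percent = swap_usage(line)
print(f"swap used: {used // 1024**2}M of {total // 1024**2}M ({percent:.1f}%)")
```

With the line from this host, that is 41M out of 8405M, so the host is in swap but only marginally.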

Mischa


> 
> -ml
> 
>> fd0# panic: mtx 0x81f353f0: locking against myself
>> Using drive 0, partition 3.
>> Loading..
>> probing: pc0 com0 mem[638K 510M a20=on]
>> disk: hd0+ hd1+
>> >> OpenBSD/amd64 BOOT 3.45
>> /
>> com0: 115200 baud
>> switching console to com0
>> >> OpenBSD/amd64 BOOT 3.45
>> boot>
>> booting hd0a:/bsd: 12666184+2937864+332896+0+704512 [987630+128+1010256+738953]=0x127d750
>> entry point at 0x81001000
>> [ using 2738000 bytes of bsd ELF symbol table ]
>> Copyright (c) 1982, 1986, 1989, 1991, 1993
>>         The Regents of the University of California.  All rights reserved.
>> Copyright (c) 1995-2019 OpenBSD. All rights reserved.  https://www.OpenBSD.org
>> 
>> OpenBSD 6.6 (GENERIC) #325: Wed Oct  2 11:38:13 MDT 2019
>>     dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
>> real mem = 520077312 (495MB)
>> avail mem = 491753472 (468MB)
>> mpath0 at root
>> scsibus0 at mpath0: 256 targets
>> mainbus0 at root
>> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf3f40 (10 entries)
>> bios0: vendor SeaBIOS version "1.11.0p2-OpenBSD-vmm" date 01/01/2011
>> bios0: OpenBSD VMM
>> acpi at bios0 not configured
>> cpu0 at mainbus0: (uniprocessor)
>> cpu0: Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz, 3101.63 MHz, 06-3a-09
>> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,LONG,LAHF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,MELTDOWN
>> cpu0: 256KB 64b/line 8-way L2 cache
>> tsc_timecounter_init: TSC skew=0 observed drift=0
>> cpu0: smt 0, core 0, package 0
>> cpu0: using VERW MDS workaround
>> pvbus0 at mainbus0: OpenBSD
>> pvclock0 at pvbus0
>> pci0 at mainbus0 bus 0
>> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
>> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
>> viornd0 at virtio0
>> virtio0: irq 3
>> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00
>> vio0 at virtio1: address fe:e1:bb:d1:24:36
>> virtio1: irq 5
>> virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00
>> vioblk0 at virtio2
>> scsibus1 at vioblk0: 2 targets
>> sd0 at scsibus1 targ 0 lun 0: 
>> sd0: 51200MB, 512 bytes/sector, 104857600 sectors
>> virtio2: irq 6
>> virtio3 at pci0 dev 4 function 0 "Qumranet Virtio Storage" rev 0x00
>> vioblk1 at virtio3
>> scsibus2 at vioblk1: 2 targets
>> sd1 at scsibus2 targ 0 lun 0: 
>> sd1: 51200MB, 512 bytes/sector, 104857600 sectors
>> virtio3: irq 7
>> virtio4 at pci0 dev 5 function 0 "OpenBSD VMM Control" rev 0x00
>> vmmci0 at virtio4
>> virtio4: irq 9
>> isa0 at mainbus0
>> isadma0 at isa0
>> com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo
>> com0: console
>> vscsi0 at root
>> scsibus3 at vscsi0: 256 targets
>> softraid0 at root
>> scsibus4 at softraid0: 256 targets
>> root on sd0a (d4c2875ac610c324.a) swap on sd0b dump on sd0b
>> WARNING: / was not properly unmounted
>> Automatic 

Re: em0 stops working after random amount of time in OpenBSD 5.8/5.9

2016-04-02 Thread Mischa Peters
Hi Evgeniy,

One of the questions I had was indeed how to troubleshoot this. Nothing out of 
the ordinary shows up in dmesg or messages, and I cannot find anything that 
changes on the interface or in netstat.

Until the 18th of March this machine was running FreeBSD, without any issues. I 
moved from 9.3-RELEASE-pXX to OpenBSD 5.8. There are still 2 machines of the 
same type that are running FreeBSD 9.3 without any issues. 

I do know there are unresolved issues in FreeBSD 10 with this NIC, but they 
mostly have to do with the driver not loading.

The strange thing is that it works after a reboot: I can ping an IP. But 
as soon as I run ftp or pkg_add, for example, it stops working. 
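
One way to pin down when the link drops relative to starting ftp or pkg_add is to log reachability once a second and collapse the samples into outage windows. The following is a rough Python sketch, not a tested tool: the function names `probe`, `watch`, and `outages` are hypothetical, and `probe` assumes a ping that accepts `-c` (count) and `-w` (wait in seconds).

```python
import subprocess
import time
from typing import Iterable, List, Optional, Tuple


def probe(host: str, wait_s: int = 2) -> bool:
    """Send one ICMP echo and report whether ping exited successfully.

    Assumption: the local ping accepts -c (count) and -w (wait, seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-w", str(wait_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(host: str, interval_s: float = 1.0, count: int = 60) -> List[Tuple[float, bool]]:
    """Collect `count` (timestamp, reachable) samples, one per interval."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), probe(host)))
        time.sleep(interval_s)
    return samples


def outages(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, Optional[float]]]:
    """Collapse samples into (start, end) outage windows.

    `end` is None when the host was still unreachable at the last sample.
    """
    windows: List[Tuple[float, Optional[float]]] = []
    down_since: Optional[float] = None
    for timestamp, reachable in samples:
        if not reachable and down_since is None:
            down_since = timestamp          # outage begins
        elif reachable and down_since is not None:
            windows.append((down_since, timestamp))  # outage ends
            down_since = None
    if down_since is not None:
        windows.append((down_since, None))
    return windows
```

Running `outages(watch(host))` from a second machine on the same subnet while the transfer starts would give timestamps to line up against the switch's ARP and MAC table state.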

Mischa

> On 02 Apr 2016, at 21:15, Evgeniy Sudyr  wrote:
> 
> Mischa,
> 
> 1) Consider using sendbug(1) to provide a report (read the section saying
> "The following items should be contained in every bug report")
> 
> http://www.openbsd.org/report.html
> 
> 2) I suggest providing more details about your system configuration.
> Most interesting is whether any sysctl tuning was done, and whether this was
> a working system or a new/fresh setup which never worked before.
> 
> 3) Can it be some broken hardware? I just googled for your board / NIC
> and both are about 9yrs old.
> 
> --
> Evgeniy
> 
>> On Sat, Apr 2, 2016 at 7:36 PM, Mischa  wrote:
>> Hi All,
>> 
>> I just tried with: OpenBSD host 5.9 GENERIC.MP#1888 amd64
>> The result is still the same. Networking stops and sometimes continues after 
>> some time.
>> Could this be because of SMP networking?
>> 
>> What I am seeing on the switch is that the MAC address is still in the MAC 
>> table.
>> But there is no longer an ARP entry.
>> 
>> Mischa
>> 
>> 
>>> On 22 Mar 2016, at 12:18, Mischa  wrote:
>>> 
>>> Hi All,
>>> 
>>> I would be happy to provide remote console access if that helps.
>>> 
>>> Mischa
>>> 
>>>> On 20 Mar 2016, at 14:52, Mischa wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> I am running OpenBSD 5.8, and tried 5.9 as well, on a SuperMicro PDSMi 
>>>> which has an Intel 82573E.
>>>> For some reason networking just stops working after a random amount of 
>>>> time, usually when I am SSH-ed into the machine.
>>>> When I am connected via the console it seems to keep working longer. I am 
>>>> testing this by pinging an IP address on the local subnet.
>>>> 
>>>> Unfortunately, I cannot find anything different from an interface or 
>>>> subnet perspective, and nothing appears in the logs.
>>>> The problem goes away, temporarily, when I bounce the interface on the 
>>>> switch.
>>>> 
>>>> How can I best troubleshoot the cause?
 
>>>> # dmesg
>>>> em0 at pci4 dev 0 function 0 "Intel 82573E" rev 0x03: msi, address 00:30:48:96:42:06
>>>> 
>>>> # pcidump -v
>>>> 13:0:0: Intel 82573E
>>>>     0x: Vendor ID: 8086 Product ID: 108c
>>>>     0x0004: Command: 0107 Status: 0010
>>>>     0x0008: Class: 02 Subclass: 00 Interface: 00 Revision: 03
>>>>     0x000c: BIST: 00 Header Type: 00 Latency Timer: 00 Cache Line Size: 10
>>>>     0x0010: BAR mem 32bit addr: 0xe8a0/0x0002
>>>>     0x0014: BAR empty ()
>>>>     0x0018: BAR io addr: 0x5000/0x0020
>>>>     0x001c: BAR empty ()
>>>>     0x0020: BAR empty ()
>>>>     0x0024: BAR empty ()
>>>>     0x0028: Cardbus CIS: 
>>>>     0x002c: Subsystem Vendor ID: 15d9 Product ID: 108c
>>>>     0x0030: Expansion ROM Base Address: 
>>>>     0x0038: 
>>>>     0x003c: Interrupt Pin: 01 Line: 0b Min Gnt: 00 Max Lat: 00
>>>>     0x00c8: Capability 0x01: Power Management
>>>>     0x00d0: Capability 0x05: Message Signaled Interrupts (MSI)
>>>>     0x00e0: Capability 0x10: PCI Express
>>>>     Link Speed: 2.5 / 2.5 GT/s Link Width: x1 / x1
>>>> 
>>>> # ifconfig em0
>>>> em0: flags=8843 mtu 1500
>>>>     lladdr 00:30:48:96:42:06
>>>>     priority: 0
>>>>     groups: egress
>>>>     media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
>>>>     status: active
>>>>     inet  netmask 0xff00 broadcast 46.23.86.255
>>>> 
>>>> # netstat -nr
>>>> Internet:
>>>> Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
>>>> default UGS3   37 - 8 em0
>>>> /2446.23.86.132   UC 10 - 8 em0
>>>>  02:e0:52:9c:3c:56  UHLc   10 - 8 em0
>>>> 00:30:48:96:42:06  HLl00 - 1 lo0
>>>>   UHb00 - 1 em0
>>>> 127/8  127.0.0.1  UGRS   00 32768 8 lo0
>>>> 127.0.0.1  127.0.0.1  UHl10 32768 1 lo0
>>>> 224/4  127.0.0.1  URS00 32768 8 lo0
 
>>>> Thanx!
>>>> 
>>>> Mischa
> 
> 
> 
> -- 
> --
> With regards,
> Eugene Sudyr