Re: HEADS UP: Merging drm update
Hi,

On 2021/12/21 8:28, Taylor R Campbell wrote:
>> Date: Mon, 20 Dec 2021 11:35:45 +0900
>> From: Kengo NAKAHARA
>>
>> GENERIC kernel without DIAGNOSTIC option fails to build.
>> Could you apply the following patch?
>> https://github.com/knakahara/netbsd-src/commit/b1c93870ef5689201b6eb7e08811bc40e3e1543e
>
> Thanks! I did it slightly differently, to reduce the amount of
> upstream patching we need to do. OK now?

LGTM. Thank you for your quick response!

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Division,
Technology Unit

Kengo NAKAHARA
Re: HEADS UP: Merging drm update
i915_scheduler.c
@@ -55,7 +55,7 @@ static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 static void assert_priolists(struct intel_engine_execlists * const execlists)
 {
 	struct rb_node *rb;
-	long last_prio, i;
+	long last_prio __diagused, i;
 
 	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
 		return;
diff --git a/sys/external/bsd/drm2/i915drm/i915_pci_autoconf.c b/sys/external/bsd/drm2/i915drm/i915_pci_autoconf.c
index 3d0bc912ef4..9a8fd33809b 100644
--- a/sys/external/bsd/drm2/i915drm/i915_pci_autoconf.c
+++ b/sys/external/bsd/drm2/i915drm/i915_pci_autoconf.c
@@ -171,7 +171,7 @@ i915drmkms_attach_real(device_t self)
 	struct i915drmkms_softc *const sc = device_private(self);
 	struct pci_attach_args *const pa = &sc->sc_pa;
 	const struct pci_device_id *ent = i915drmkms_pci_lookup(pa);
-	const struct intel_device_info *const info = (struct intel_device_info *) ent->driver_data;
+	const struct intel_device_info *const info __diagused = (struct intel_device_info *) ent->driver_data;
 	int error;
 
 	KASSERT(info != NULL);

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Division,
Technology Unit

Kengo NAKAHARA
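For readers wondering why a variable that is only referenced from KASSERT() breaks a non-DIAGNOSTIC build, here is a minimal userland sketch of the idea behind __diagused. The names MY_DIAGNOSTIC, MY_KASSERT, my_diagused, struct device_info and device_generation() are hypothetical stand-ins, not the kernel's own definitions, and the attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the kernel's DIAGNOSTIC / KASSERT /
 * __diagused trio.  Build with -DMY_DIAGNOSTIC to enable the checks.
 */
#ifdef MY_DIAGNOSTIC
#define MY_KASSERT(e)   assert(e)
#define my_diagused     /* the variable is really used by MY_KASSERT */
#else
#define MY_KASSERT(e)   ((void)0)
/* Assumption: GCC/Clang attribute syntax. */
#define my_diagused     __attribute__((__unused__))
#endif

struct device_info {
    int gen;
};

/*
 * 'info' is referenced only inside the assertion.  Without the
 * annotation, a build without MY_DIAGNOSTIC would warn about an
 * unused variable, which is fatal when warnings are errors.
 */
static int
device_generation(const struct device_info *table, size_t index)
{
    const struct device_info *const info my_diagused = &table[index];

    MY_KASSERT(info != NULL);
    return table[index].gen;
}
```

The function behaves the same with or without -DMY_DIAGNOSTIC; only the assertion and the warning suppression change.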
Re: x86 in-kernel fpu bug
Hi,

On 2020/07/21 1:49, Taylor R Campbell wrote:
>> Date: Mon, 20 Jul 2020 11:04:21 +0100
>> From: Patrick Welche
>>
>> After a -current/amd64 update, a sudden outbreak of floating point
>> exceptions:
>>
>> /usr/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/tree-ssa-operands.c:1348:1: internal compiler error: Floating point exception
>>
>> Fetching message headers... Floating point exception (core dumped) mutt
>>
>> Any guesses?
>
> There's a good chance this has been fixed in sys/arch/x86/x86/fpu.c
> revision 1.72 -- can you update and try again with a new kernel?

This commit fixes the floating point exception during "build.sh release"
on my machine, which uses AES with ipsecif(4). The CPU is an AMD A4-5000,
which has AES-NI.

Thank you for your fix.

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Development Department,
Product Division,
Technology Unit

Kengo NAKAHARA
Re: wm0 panic
Hi,

On 2020/06/28 0:24, Patrick Welche wrote:
> Trying a today's -current/amd64 with DIAGNOSTIC/DEBUG/LOCKDEBUG, I can
> boot multiuser without a network. If I log in as root, as soon as I hit
> enter:
>
> # ifconfig wm0 inet 10.0.0.62 netmask 0xff00
> [ 127.5763268] Kernel lock error 127.5763268] lock address : 0x8106ab40 type : spin
> [ 127.5863237] initialized : 0x80b0bbb9
> [ 127.5863237] shared holds : 0 exclusive: 1
> [ 127.5963238] shares wanted: 0 exclusive: 1
> [ 127.6063236] relevant cpu : 1 last held: 0
> [ 127.6163235] relevant lwp : 0x8d419a07f20
> [ 127.6163235] last locked* : 0x80a7d2f5 unlocked : 0x80a7d2e6
> [ 127.6263235] curcpu holds : 0 wanted by: 0x8d419a07f200
> [ 127.6363234] panic: LOCKDEBock,244: spinout
> [ 127.6363234] cpu1: Begin traceback...
> [ 127.6463233] vpanic() at netbsd:vpanic+0x152
> [ 127.6463233] snprintf() at netbsd:snprintf
> [ 127.6563232] lockdebug_more() at netbsd:lockdebug_more
> [ 127.6563232] _kernel_lock() at netbsd:_kernel_lock+0x244
> [ 127.6663231] ip_slowtimo() at netbsd:ip_slowtimo+0x1a
> [ 127.6763231] pfslowtimo() at netbsd:pfslowtimo+0x34
> [ 127.6763231] callout_softclock() at netbsd:callout_softclock+0x10f
> [ 127.6863230] softint_disph+0x108
> [ 127.6863230] DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xa4825d02eff0
> [ 127.6963230] Xsoftintr() at netbsd:Xsoftintr+0x4f
> [ 127.7063229] --- interrupt ---
> [ 127.706322traceback...

It seems some other code has held KERNEL_LOCK for too long.
Could you show the function at the "last locked" address?
# e.g. addr2line -e "your kernel image" -f 0x80a7d2f5

If the panic is reproducible, could you also show the output of
"show all locks/t" in ddb?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
Product Development Department,
Product Division,
Technology Unit

Kengo NAKAHARA
Re: MSI/MSI-X implementation and interrupt handling on i386/amd64
Hi,

On 2018/12/11 6:49, Jaromír Doleček wrote:
> On Thu, 6 Dec 2018 at 16:05, Cherry G.Mathew wrote:
>> The right thing to do is to stop using a bit mask entirely, and using
>> a bit more scalable Data structure for this. This isn't trivial though -
>> the assembler stuff will be harder to maintain correctness than a
>> straightup buslocked bitscan/compare etc.
>
> What about just bumping this to 64 on amd64, where we have the 64-bit
> atomic ops? While keeping i386 still on 32.
>
> We seem to have already i386 and amd64 variants of the interrupt
> assembler, so maybe not so bad that they would diverge further.
>
> It would be nice to do something to bump the limit. If we have general
> consensus is that this is worth doing, I can try to write something
> and see how ugly/difficult it would become with 64bit bitmasks. I
> don't feel like delving into rewriting this to use completely
> different structure ...

I agree. Note that some old Athlon 64 series CPUs (before socket AM2)
do not support the cmpxchg16b instruction. That would affect rewriting
spllower to support a 64-bit interrupt bitmask.

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
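To make the trade-off above concrete, here is a rough userland sketch of a 64-slot pending-interrupt mask handled with ordinary 64-bit atomics. It is an illustration only: pending_mask_t, mark_pending() and take_highest_pending() are hypothetical names, and the real x86 interrupt code is written in assembler. The cmpxchg16b concern presumably comes up when the mask has to be updated atomically together with another field, which is an assumption here and is not modelled by this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-slot pending mask; not the kernel's actual layout. */
typedef _Atomic uint64_t pending_mask_t;

/* Mark interrupt slot 'slot' (0..63) as pending. */
static void
mark_pending(pending_mask_t *mask, unsigned slot)
{
    assert(slot < 64);
    atomic_fetch_or(mask, UINT64_C(1) << slot);
}

/*
 * Atomically claim the highest-numbered pending slot.
 * Returns the slot number, or -1 if nothing is pending.
 */
static int
take_highest_pending(pending_mask_t *mask)
{
    uint64_t old = atomic_load(mask);

    while (old != 0) {
        int slot = 63;

        /* Portable bit scan; real code would use a bsr/clz style op. */
        while ((old & (UINT64_C(1) << slot)) == 0)
            slot--;

        /* On failure 'old' is refreshed with the current mask value. */
        if (atomic_compare_exchange_weak(mask, &old,
            old & ~(UINT64_C(1) << slot)))
            return slot;
    }
    return -1;
}
```

A single 64-bit compare-and-swap is enough as long as the mask is the only state being updated; the design question in the thread is what happens once that is no longer true.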
Re: panic by "acpidump -dt" on -current(Nov 22)
Hi,

On 2017/11/27 22:15, Christos Zoulas wrote:
> In article
> ,
> Chavdar Ivanov wrote:
>>
>> And also I get the same panic with 'acpidump -dt', tested with the
>> yesterday's image from releng -
>> http://nycdn.netbsd.org/pub/NetBSD-daily/HEAD/201711212310Z/images/NetBSD-8.99.7-amd64.iso
>> .
>>
>
> This is fixed with:
>
> /* $NetBSD: pmap.c,v 1.267 2017/11/22 21:26:01 christos Exp $ */

It works fine for me. Thank you.

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
panic by "acpidump -dt" on -current(Nov 22)
Hi,

I hit a panic when running "acpidump -dt" on -current (Nov 22).
The specific version is the following commit.
    https://mail-index.netbsd.org/source-changes/2017/11/21/msg089839.html
    # It includes chs@n.o's fix of pmap_enter_ma() and uvm_fault_upper_enter().

Here is the panic message and backtrace.

panic: prevented access to 0x10 (SMAP)
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x8021d2d5 cs 0x8 rflags 0x246 cr2 0x10 ilevel 0 rsp 0xe40110c4b7e0
curlwp 0xe4027b7cd220 pid 866.1 lowest kstack 0xe40110c482c0
Stopped in pid 866.1 (acpidump) at netbsd:breakpoint+0x5: leave
db{6}> bt
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
panic() at netbsd:panic+0x3c
trap() at netbsd:trap+0xbf0
--- trap (number 6) ---
pmap_enter_ma() at netbsd:pmap_enter_ma+0xe2a
pmap_enter_default() at netbsd:pmap_enter_default+0x1d
udv_fault() at netbsd:udv_fault+0x151
uvm_fault_internal() at netbsd:uvm_fault_internal+0x6d4
trap() at netbsd:trap+0x3f0
--- trap (number 6) ---
1568033cf:

Here is dmesg.
    http://netbsd.org/~knakahara/20171122-acpidump-panic-dmesg

msaitoh@n.o hit the same panic on another machine.
Has anyone else run into this issue?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: Possible regression in wm(4)?
Hi,

Sorry, one more thing.

On 2017/11/15 11:31, Kengo NAKAHARA wrote:
> On 2017/11/14 21:53, Bert Kiers wrote:
>> On Tue, Nov 14, 2017 at 08:07:46PM +0900, Kengo NAKAHARA wrote:
>>> I'm sorry I cannot solve it...
>>> Hmm, now I think this problem may relate to MSI/MSI-X interrupts
>>> setting about ioapic. If it is not a problem, could you try the
>>> following patch?
>>> I believe this patch let wm(4) do the same behavior as NetBSD-7,
>>> that is, wm(4) uses INTx interrupt instead of MSI/MSI-X interrupt.
>>>
>>>
>>> --- a/sys/dev/pci/if_wm.c
>>> +++ b/sys/dev/pci/if_wm.c
>>> @@ -174,10 +174,10 @@ int wm_debug = WM_DEBUG_TX | WM_DEBUG_RX |
>>> WM_DEBUG_LINK | WM_DEBUG_GMII
>>>  #define WM_MAX_NINTR (WM_MAX_NQUEUEINTR + 1)
>>>
>>>  #ifndef WM_DISABLE_MSI
>>> -#define WM_DISABLE_MSI 0
>>> +#define WM_DISABLE_MSI 1
>>>  #endif
>>>  #ifndef WM_DISABLE_MSIX
>>> -#define WM_DISABLE_MSIX 0
>>> +#define WM_DISABLE_MSIX 1
>>>  #endif
>>>
>>>  int wm_disable_msi = WM_DISABLE_MSI;
>>>
>>
>> That still does not work. The NIC probes as
>>
>> yvresse# grep ^wm1 dmesg.netbsd8wmfixC
>> wm1 at pci1 dev 0 function 1: 82576 1000BaseT Ethernet (rev. 0x01)
>> wm1: interrupting at ioapic1 pin 16
>> wm1: PCI-Express bus
>> wm1: 16384 words (16 address bits) SPI EEPROM, version 1.43, Image Unique ID
>> e606
>> wm1: Ethernet address 00:30:48:9e:a9:2f
>> wm1: Copper
>> wm1: 0x74440
>>
>> But still no traffic.
>
> Oh, the dmesg is as expected, but the behavior is not.
> Hmm, sorry, could you give me the following information?
>     - "intrctl list" result on NetBSD-8
>         - before trying traffic and after it
>     - full dmesg on NetBSD-8 which boot with "-xv" option
>     - full dmesg on NetBSD-7 (which boot -xv if you can)
>     - "acpidump -dt" result

And
    - "ifconfig -v wm0" and "ifconfig -v wm1" result

>> I have two wm(4) cards with different chips. I'll put them in the
>> machine tomorrow and see what happens.
>
> I expect it will work.
>
>> Btw, this is the only device that tries to use MSI(X) in this box.
>>
>> Two devices use ioapic pin 16:
>>
>> wm1: interrupting at ioapic1 pin 16
>> uhci0: interrupting at ioapic0 pin 16
>>
>> and systat vmstat shows between 0 and 10 interrupts per second there on
>> that pin
>
> Yes, they use pin 16, but wm1 use ioapic1 whereas uhci0 use ioapic0.
> Which ioapic's pin 16 interrupts occur?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: Possible regression in wm(4)?
Hi,

On 2017/11/14 21:53, Bert Kiers wrote:
> On Tue, Nov 14, 2017 at 08:07:46PM +0900, Kengo NAKAHARA wrote:
>> I'm sorry I cannot solve it...
>> Hmm, now I think this problem may relate to MSI/MSI-X interrupts
>> setting about ioapic. If it is not a problem, could you try the
>> following patch?
>> I believe this patch let wm(4) do the same behavior as NetBSD-7,
>> that is, wm(4) uses INTx interrupt instead of MSI/MSI-X interrupt.
>>
>>
>> --- a/sys/dev/pci/if_wm.c
>> +++ b/sys/dev/pci/if_wm.c
>> @@ -174,10 +174,10 @@ int wm_debug = WM_DEBUG_TX | WM_DEBUG_RX |
>> WM_DEBUG_LINK | WM_DEBUG_GMII
>>  #define WM_MAX_NINTR (WM_MAX_NQUEUEINTR + 1)
>>
>>  #ifndef WM_DISABLE_MSI
>> -#define WM_DISABLE_MSI 0
>> +#define WM_DISABLE_MSI 1
>>  #endif
>>  #ifndef WM_DISABLE_MSIX
>> -#define WM_DISABLE_MSIX 0
>> +#define WM_DISABLE_MSIX 1
>>  #endif
>>
>>  int wm_disable_msi = WM_DISABLE_MSI;
>>
>
> That still does not work. The NIC probes as
>
> yvresse# grep ^wm1 dmesg.netbsd8wmfixC
> wm1 at pci1 dev 0 function 1: 82576 1000BaseT Ethernet (rev. 0x01)
> wm1: interrupting at ioapic1 pin 16
> wm1: PCI-Express bus
> wm1: 16384 words (16 address bits) SPI EEPROM, version 1.43, Image Unique ID
> e606
> wm1: Ethernet address 00:30:48:9e:a9:2f
> wm1: Copper
> wm1: 0x74440
>
> But still no traffic.

Oh, the dmesg is as expected, but the behavior is not.
Hmm, sorry, could you give me the following information?
    - "intrctl list" result on NetBSD-8
        - before trying traffic and after it
    - full dmesg on NetBSD-8 booted with the "-xv" option
    - full dmesg on NetBSD-7 (booted with -xv if you can)
    - "acpidump -dt" result

> I have two wm(4) cards with different chips. I'll put them in the
> machine tomorrow and see what happens.

I expect it will work.

> Btw, this is the only device that tries to use MSI(X) in this box.
>
> Two devices use ioapic pin 16:
>
> wm1: interrupting at ioapic1 pin 16
> uhci0: interrupting at ioapic0 pin 16
>
> and systat vmstat shows between 0 and 10 interrupts per second there on
> that pin

Yes, they both use pin 16, but wm1 uses ioapic1 whereas uhci0 uses ioapic0.
On which ioapic's pin 16 do the interrupts occur?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: Possible regression in wm(4)?
Hi,

On 2017/11/14 19:33, Bert Kiers wrote:
> On Tue, Nov 14, 2017 at 12:34:40PM +0900, Kengo NAKAHARA wrote:
>
> I am sorry to have to say they both do not fix the problem.
>
>> == (A) ==
>> --- a/sys/dev/pci/if_wm.c
>> +++ b/sys/dev/pci/if_wm.c
>> @@ -4883,8 +4883,8 @@ wm_adjust_qnum(struct wm_softc *sc, int nvectors)
>>  		hw_nrxqueues = 4;
>>  		break;
>>  	case WM_T_82576:
>> -		hw_ntxqueues = 16;
>> -		hw_nrxqueues = 16;
>> +		hw_ntxqueues = 1;
>> +		hw_nrxqueues = 1;
>>  		break;
>>  	case WM_T_82580:
>>  	case WM_T_I350:
>> == (A) ==
>
> With this patch, it probes as
>
> yvresse# cat dmesg.netbsd8wmfixA|grep wm1
> wm1 at pci1 dev 0 function 1: 82576 1000BaseT Ethernet (rev. 0x01)
> wm1: for TX and RX interrupting at msix1 vec 0 affinity to 1
> wm1: for LINK interrupting at msix1 vec 1
> wm1: PCI-Express bus
> wm1: 16384 words (16 address bits) SPI EEPROM, version 1.43, Image Unique ID
> e606
> wm1: Ethernet address 00:30:48:9e:a9:2f
> wm1: Copper
> wm1: 0x74440
> igphy1 at wm1 phy 1: i82566 10/100/1000 media interface, rev. 1
>
> but there is no incoming traffic
>
>> == (B) ==
>> --- a/sys/dev/pci/if_wm.c
>> +++ b/sys/dev/pci/if_wm.c
>> @@ -177,7 +177,7 @@ int wm_debug = WM_DEBUG_TX | WM_DEBUG_RX |
>> WM_DEBUG_LINK | WM_DEBUG_GMII
>>  #define WM_DISABLE_MSI 0
>>  #endif
>>  #ifndef WM_DISABLE_MSIX
>> -#define WM_DISABLE_MSIX 0
>> +#define WM_DISABLE_MSIX 1
>>  #endif
>>
>>  int wm_disable_msi = WM_DISABLE_MSI;
>> == (B) ==
>
> With this one,
>
> yvresse# cat dmesg.netbsd8wmfixB|grep wm1
> wm1 at pci1 dev 0 function 1: 82576 1000BaseT Ethernet (rev. 0x01)
> wm1: interrupting at msi1 vec 0
> wm1: PCI-Express bus
> wm1: 16384 words (16 address bits) SPI EEPROM, version 1.43, Image Unique ID
> e606
> wm1: Ethernet address 00:30:48:9e:a9:2f
> wm1: Copper
> wm1: 0x74440
> igphy1 at wm1 phy 1: i82566 10/100/1000 media interface, rev. 1
>
> and also no traffic

I'm sorry I could not solve it...
Hmm, now I think this problem may be related to how MSI/MSI-X interrupts
are set up with respect to the ioapic. If you don't mind, could you try
the following patch?
I believe this patch makes wm(4) behave the same as on NetBSD-7,
that is, wm(4) uses an INTx interrupt instead of MSI/MSI-X interrupts.

--- a/sys/dev/pci/if_wm.c
+++ b/sys/dev/pci/if_wm.c
@@ -174,10 +174,10 @@ int wm_debug = WM_DEBUG_TX | WM_DEBUG_RX | WM_DEBUG_LINK | WM_DEBUG_GMII
 #define WM_MAX_NINTR (WM_MAX_NQUEUEINTR + 1)
 
 #ifndef WM_DISABLE_MSI
-#define WM_DISABLE_MSI 0
+#define WM_DISABLE_MSI 1
 #endif
 #ifndef WM_DISABLE_MSIX
-#define WM_DISABLE_MSIX 0
+#define WM_DISABLE_MSIX 1
 #endif
 
 int wm_disable_msi = WM_DISABLE_MSI;

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: Possible regression in wm(4)?
Hi,

On 2017/11/11 7:23, Bert Kiers wrote:
> On Fri, Nov 10, 2017 at 08:23:21PM +0100, Jimmy Johansson wrote:
>
> Hi,
>
>> Has anybody else had issues with these interfaces in NetBSD 8 or NetBSD
>> current?
>
> yes, see kern/52717: no wm(4) networking in 8.0_BETA
> also with i82576

Hmm, this looks like an interrupt-related problem on dual-socket systems.
Could you try the following patch (A)? If it does not work, could you
try patch (B)?

Patch (A) uses only two MSI-X vectors, so all interrupts may be bound
to socket0. That can avoid problems related to dual-socket systems.

Patch (B) uses MSI instead of MSI-X. That gives almost the same behavior
as NetBSD-7 or older.

== (A) ==
--- a/sys/dev/pci/if_wm.c
+++ b/sys/dev/pci/if_wm.c
@@ -4883,8 +4883,8 @@ wm_adjust_qnum(struct wm_softc *sc, int nvectors)
 		hw_nrxqueues = 4;
 		break;
 	case WM_T_82576:
-		hw_ntxqueues = 16;
-		hw_nrxqueues = 16;
+		hw_ntxqueues = 1;
+		hw_nrxqueues = 1;
 		break;
 	case WM_T_82580:
 	case WM_T_I350:
== (A) ==

== (B) ==
--- a/sys/dev/pci/if_wm.c
+++ b/sys/dev/pci/if_wm.c
@@ -177,7 +177,7 @@ int wm_debug = WM_DEBUG_TX | WM_DEBUG_RX | WM_DEBUG_LINK | WM_DEBUG_GMII
 #define WM_DISABLE_MSI 0
 #endif
 #ifndef WM_DISABLE_MSIX
-#define WM_DISABLE_MSIX 0
+#define WM_DISABLE_MSIX 1
 #endif
 
 int wm_disable_msi = WM_DISABLE_MSI;
== (B) ==

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
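As a reading aid for the WM_DISABLE_MSI / WM_DISABLE_MSIX patches in this thread, the fallback order they exercise can be sketched roughly as below. This is not the driver's actual code: enum intr_kind and choose_intr_kind() are hypothetical, and the case where only MSI is disabled is an assumption, since the thread reports only the default (MSI-X), MSI-X disabled (MSI), and both disabled (INTx via the ioapic).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interrupt kinds, ordered from least to most preferred. */
enum intr_kind { INTR_INTX, INTR_MSI, INTR_MSIX };

static_assert(INTR_INTX < INTR_MSI && INTR_MSI < INTR_MSIX,
    "kinds are ordered by preference");

/*
 * Pick the most capable interrupt kind the device supports that has
 * not been disabled.  Assumption: disabling MSI alone still allows
 * MSI-X; the thread does not show that combination.
 */
static enum intr_kind
choose_intr_kind(bool has_msix, bool has_msi,
    bool disable_msix, bool disable_msi)
{
    if (has_msix && !disable_msix)
        return INTR_MSIX;
    if (has_msi && !disable_msi)
        return INTR_MSI;
    return INTR_INTX;   /* legacy pin interrupt, e.g. "ioapic1 pin 16" */
}
```

Under this reading, patch (B) corresponds to disable_msix only, and the later patch that sets both macros to 1 corresponds to the INTx fallback.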
Re: Crash related to VLANs in Oct 18th -current
Hi, On 2017/10/24 8:23, Roy Marples wrote: > On 23/10/2017 12:18, Roy Marples wrote: >> On 23/10/2017 11:28, Tom Ivar Helbekkmo wrote: >>> Has something changed that makes dhcpcd now insist on listening to all >>> interfaces (including the 802.1q trunk)? >> >> Yes. >> I will try and improve the logic so it's only the relevant interfaces. >> The change was made to allow IP address sharing on many interfaces via >> DHCP without actually removing the IP address from the non active >> interfaces. >> This might have been over-zealous on my part. >> >>> Can I make it not do that? >> >> Currently not, no. >> Hopefully I can change it so that no toggle for it is needed. > > Patch here to make it not do this anymore: > https://roy.marples.name/git/dhcpcd.git/commit/?id=c72da9a1ce60d006136c5aa3e1c923d96761a171 > > The caveat is that we now need to ARP announce the address during reboot > to ensure dhcpcd gets the reply on an active interface. > > Let me know how it works for you. Thank you very much for your help! I will commit the wm(4) vlan fix patch. Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: Crash related to VLANs in Oct 18th -current
Hi,

On 2017/10/22 23:56, Tom Ivar Helbekkmo wrote:
> Tom Ivar Helbekkmo writes:
>
>> That did the trick! Thank you! :)

Thank you for your testing!

> I'm actually wondering if there may be something else strange going on.
> Everything works fine -- but I have this dhcpcd running, because one of
> my VLANs is connected to a network where this machine has to accept a
> DHCP provisioned IP address from a server. I run "dhcpcd -q vlan9", and
> also give it a configuration file that should keep it from doing
> anything I don't want:
>
> allowinterfaces vlan9
> interface vlan9
> background
> persistent
> hostname_short
> nogateway
> nohook resolv.conf, wpa_supplicant, hostname, ntp.conf
> script /usr/bin/true
>
> However, after this last upgrade, I keep getting messages from dhcpcd
> about other interfaces, where this host is the DHCP server, like:
>
> Oct 22 16:48:28 barsoom dhcpcd[16236]: vlan2: invalid UDP packet from
> 172.27.201.1
> Oct 22 16:48:28 barsoom dhcpcd[16236]: wm0: invalid UDP packet from
> 172.27.201.1
>
> This happens every time a host on one of the other VLANs gets an address
> from the local DHCP server, and I get this pair of messages; one for the
> VLAN in question, one for wm0, which is the vlanif with the trunk on it.
>
> Running 8.99.1 from about two months ago, these messages did not occur.

Hmm..., sorry, I cannot determine the cause of this problem from that
information alone. Could you capture a tcpdump, if you don't mind?

> roy@n.o
I think the issue seems to be related to DHCP. Do you have any idea how
to solve it?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: Crash related to VLANs in Oct 18th -current
Hi,

On 2017/10/20 2:30, Tom Ivar Helbekkmo wrote:
> I just updated to a fresh -current yesterday, and am running it on a
> couple of amd64 systems. It crashes during boot on the third one,
> though, the one that has VLANs.
>
> It configures wm0 thus:
>
> # cat ifconfig.wm0
> up
> media 100baseTX mediaopt full-duplex
> ip4csum tcp4csum udp4csum
>
> ...and then goes on to create a number of VLANs, by this pattern:
>
> # cat ifconfig.vlan0
> create
> vlan 10 vlanif wm0
> ip4csum tcp4csum udp4csum
> inet 193.71.27.8 prefixlen 27
> inet6 2001:8c0:c904:10::8 prefixlen 64
>
> ...and so on. I set up five of those VLANs, and a split second later
> (copied by hand from a photograph of a console terminal, as for some
> reason I didn't get a valid crash dump) (the first line is truncated):
>
> panic: kernel diagnostic assertion "(vlanid & ~ETHER_VLAN_MASK) == 0" failed:
> f
> cpu0: Begin traceback...
> vpanic() at netbsd:vpanic+0x140
> ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
> wm_rxeof() at netbsd:wm_rxeof+0x88f
> wm_intr_legacy() at netbsd:wm_intr_legacy+0xa1
> intr_biglock_wrapper() [...]
>
> The KASSERT is in the vlan_set_tag() function in sys/net/if_ether.h.
// snip

Thank you for your detailed report; it helped a lot in finding the cause.
Could you try the following patch?

diff --git a/sys/dev/pci/if_wm.c b/sys/dev/pci/if_wm.c
index 8a2feedb607..00d06331c93 100644
--- a/sys/dev/pci/if_wm.c
+++ b/sys/dev/pci/if_wm.c
@@ -8095,11 +8095,11 @@ wm_rxdesc_get_vlantag(struct wm_rxqueue *rxq, int idx)
 	struct wm_softc *sc = rxq->rxq_sc;
 
 	if (sc->sc_type == WM_T_82574)
-		return rxq->rxq_ext_descs[idx].erx_ctx.erxc_vlan;
+		return EXTRXC_VLAN_ID(rxq->rxq_ext_descs[idx].erx_ctx.erxc_vlan);
 	else if ((sc->sc_flags & WM_F_NEWQUEUE) != 0)
-		return rxq->rxq_nq_descs[idx].nqrx_ctx.nrxc_vlan;
+		return NQRXC_VLAN_ID(rxq->rxq_nq_descs[idx].nqrx_ctx.nrxc_vlan);
 	else
-		return rxq->rxq_descs[idx].wrx_special;
+		return WRX_VLAN_ID(rxq->rxq_descs[idx].wrx_special);
 }
 
 static inline int
diff --git a/sys/dev/pci/if_wmreg.h b/sys/dev/pci/if_wmreg.h
index c005414764c..97a9964b2be 100644
--- a/sys/dev/pci/if_wmreg.h
+++ b/sys/dev/pci/if_wmreg.h
@@ -208,6 +208,12 @@ typedef union ext_rxdesc {
 #define EXTRXC_STATUS_PKTTYPE_MASK	__BITS(19,16)
 #define EXTRXC_STATUS_PKTTYPE(status)	__SHIFTOUT(status,EXTRXC_STATUS_PKTTYPE_MASK)
 
+#define	EXTRXC_VLAN_ID_MASK	__BITS(11,0)	/* VLAN identifier mask */
+#define	EXTRXC_VLAN_ID(x)	((x) & EXTRXC_VLAN_ID_MASK) /* VLAN identifier */
+#define	EXTRXC_VLAN_CFI		__BIT(12)	/* Canonical Form Indicator */
+#define	EXTRXC_VLAN_PRI_MASK	__BITS(15,13)	/* VLAN priority mask */
+#define	EXTRXC_VLAN_PRI(x)	__SHIFTOUT((x),EXTRXC_VLAN_PRI_MASK) /* VLAN priority */
+
 /* advanced RX descriptor for 82575 and newer */
 typedef union nq_rxdesc {
 	struct {
@@ -330,6 +336,12 @@ typedef union nq_rxdesc {
 #define NQRXC_STATUS_MC	__BIT(19) /* Packet received from Manageability Controller */
 				/* "MBC" in i350 spec */
 
+#define	NQRXC_VLAN_ID_MASK	__BITS(11,0)	/* VLAN identifier mask */
+#define	NQRXC_VLAN_ID(x)	((x) & NQRXC_VLAN_ID_MASK) /* VLAN identifier */
+#define	NQRXC_VLAN_CFI		__BIT(12)	/* Canonical Form Indicator */
+#define	NQRXC_VLAN_PRI_MASK	__BITS(15,13)	/* VLAN priority mask */
+#define	NQRXC_VLAN_PRI(x)	__SHIFTOUT((x),NQRXC_VLAN_PRI_MASK) /* VLAN priority */
+
 /*
  * The Wiseman transmit descriptor.
  *

Thanks,

--
//////
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
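The bit layout used by the new macros in the patch above (identifier in bits 11:0, CFI in bit 12, priority in bits 15:13) can be tried out in a small standalone sketch. The names vlan_id(), vlan_cfi(), vlan_pri() and set_tag() are hypothetical, and the assertion in set_tag() only mimics the failed KASSERT under the assumption that ETHER_VLAN_MASK covers the low 12 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of the 16-bit VLAN field, as in the patch's macros. */
#define VLAN_ID_MASK    0x0fffu /* bits 11:0, VLAN identifier */
#define VLAN_CFI_BIT    0x1000u /* bit 12, Canonical Form Indicator */
#define VLAN_PRI_SHIFT  13      /* bits 15:13, priority */

static uint16_t
vlan_id(uint16_t tci)
{
    return tci & VLAN_ID_MASK;
}

static unsigned
vlan_cfi(uint16_t tci)
{
    return (tci & VLAN_CFI_BIT) != 0;
}

static unsigned
vlan_pri(uint16_t tci)
{
    return tci >> VLAN_PRI_SHIFT;
}

/*
 * Hypothetical stand-in for a tag setter that, like the KASSERT in the
 * report, rejects values with bits set outside the identifier field.
 */
static void
set_tag(uint16_t *slot, uint16_t vid)
{
    assert((vid & ~VLAN_ID_MASK) == 0);
    *slot = vid;
}
```

For a raw descriptor value of 0xE00A, vlan_id() gives 10 and vlan_pri() gives 7; passing the raw value straight to set_tag() would trip the assertion, while passing vlan_id(0xE00A) would not.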
Re: problems with vlan interface counters (NetBSD 8.0_BETA)
Hi,

On 2017/08/03 18:13, s ymgch wrote:
> The problem was happened in vlan mp-ify.
> I fixed this problem by the following patch in my environment.
>
> Could you apply the patch and check it?
>
> Regards,
> s-yamaguchi@IIJ
>
> patch
> diff --git a/sys/net/if_vlan.c b/sys/net/if_vlan.c
> index 531a2f5..a4ea6e1 100644
> --- a/sys/net/if_vlan.c
> +++ b/sys/net/if_vlan.c
> @@ -1451,10 +1451,13 @@ vlan_transmit(struct ifnet *ifp, struct mbuf *m)
>  		/* mbuf is already freed */
>  		ifp->if_oerrors++;
>  	} else {
> +		size_t pktlen = m->m_pkthdr.len;
> +		bool mcast = (m->m_flags & M_MCAST) != 0;
> +
>  		ifp->if_opackets++;
> -		/*
> -		 * obytes is incremented at ether_output() or
> bridge_enqueue().
> -		 */
> +		ifp->if_obytes += pktlen;
> +		if (mcast)
> +			ifp->if_omcasts++;
>  	}
>
> out:
>
> 2017-07-28 17:10 GMT+09:00 <6b...@6bone.informatik.uni-leipzig.de>:
>> Hello,
>>
>> The interface counters of vlan interface do not count:
>>
>> bash-4.4# ifconfig -v vlan8
>> vlan8: flags=0x8843 mtu 1500
>> capabilities=7ff80
>> capabilities=7ff80
>> capabilities=7ff80
>> enabled=0
>> vlan: 8 parent: ixg0
>> address: a0:36:9f:d4:3c:08
>> input: 1966263 packets, 273676300 bytes, 66058 multicasts
>> output: 1238957 packets, 0 bytes
>> inet6 fe80::a236:9fff:fed4:3c08%vlan8/64 flags 0x0 scopeid 0x1a
>> inet6 ::::::: flags 0x0
>>
>> The output byte counter shows 0. With netbsd-7 all worked fine.
>>
>> So it is not longer possible to record traffic data via snmp.

I tested s-yamaguchi's patch together with minor fixes which I received
locally.
# He is my co-worker :)

The patch fixes the problem correctly, so I committed it as
if_vlan.c:r1.99. I will send a pullup request for the netbsd-8 branch.
Could you retry after it is pulled up?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
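The accounting that the patch above adds (packets, bytes and multicasts on success, errors on failure) can be sketched outside the kernel as follows. All names here (struct packet, struct counters, transmit_fn, count_transmit(), xmit_ok(), xmit_fail()) are hypothetical. The sketch also samples the length and multicast flag before handing the packet over, on the assumption that the buffer may no longer be valid afterwards; the thread itself does not say where the committed version reads them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct packet {
    size_t len;
    bool mcast;
};

struct counters {
    uint64_t opackets, obytes, omcasts, oerrors;
};

/* Hypothetical transmit hook: returns 0 on success, consumes 'pkt'. */
typedef int (*transmit_fn)(struct packet *pkt);

static void
count_transmit(struct counters *c, struct packet *pkt, transmit_fn xmit)
{
    assert(c != NULL && pkt != NULL);

    /* Sample what we need before the packet is handed off. */
    const size_t pktlen = pkt->len;
    const bool mcast = pkt->mcast;

    if (xmit(pkt) != 0) {
        c->oerrors++;
        return;
    }
    c->opackets++;
    c->obytes += pktlen;
    if (mcast)
        c->omcasts++;
}

/* Test doubles: both scrub the packet to simulate it being consumed. */
static int
xmit_ok(struct packet *pkt)
{
    pkt->len = 0;
    pkt->mcast = false;
    return 0;
}

static int
xmit_fail(struct packet *pkt)
{
    pkt->len = 0;
    pkt->mcast = false;
    return -1;
}
```

With this shape, a zero byte counter like the one in the report would show up immediately if the obytes update were missing.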
Re: 8.99.1: exiting unheld spin mutex
Hi, I'm sorry for late reply. On 2017/06/07 3:08, Christos Zoulas wrote: > In article <20170606173220.6e53czlyei7hep4h@danbala>, > Thomas Klausner wrote: >> Hi! >> >> I just upgraded from 7.99.75 to 8.99.1. The kernel didn't really come >> up (I didn't watch the console at first). It paniced with a mutex >> issue. When it came up again with a 8.99.1 kernel, it paniced while >> writing the kernel core file to disk, with something like: >> >> SPL NOT LOWERED ON SYSCALL EXIT >> >> When it came up again, it paniced during startup, a bit later, with a >> backtrace this time: >> >> Mutex error: mutex_vector_exit,720: exiting unheld spin mutex >> >> lock address : 0xfe882ef8d0b8 >> current cpu : 6 >> current lwp : 0xfe881c147a40 >> owner field : 0x0600 wait/spin: 0/1 >> >> panic: lock error: Mutex: ... (same thing) >> cpu6: begin traceback... >> vpanic9) >> snprintf() >> lockdebug_abort() >> mutex_vector_exit() >> crypto_getfeat() >> cryptof_ioctl() >> sys_ioctl() >> syscall() at netbsd:syscall+0x1d8 >> -- syscall (number 54) --- >> 72c3f3d18baa: >> cpu6: End traceback... >> >> That's with a DIAGNOSTIC kernel. >> > > I fixed it. Thank you for your fix! Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: wm devices don't work under current amd64
Hi,

Sorry for the very long delay...

On 2016/07/06 15:55, Masanobu SAITOH wrote:
> I got a Latitude E6400 via an auction. I tried -current and it
> worked with MSI. While checking your dmesg, I noticed that you
> didn't use ACPI. I tried without ACPI and I could reproduce the
> problem. Without ACPI, any ioapic isn't attached. knakaraha said
> it might be the reason of the problem. Perhaps the problem is
> not only for Latitude E6400 but for all systems which don't use
> ACPI and use MSI/MSI-X.

It should be fixed now. I borrowed the Latitude E6400 from msaitoh@n.o
and tested the commit below.
    http://mail-index.netbsd.org/source-changes/2017/04/14/msg083553.html
With it, wm works well without ACPI.
# Thank you nonaka@n.o

Could you try the latest kernel?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: if_wm panics on boot
Hi,

On 2017/03/04 4:31, Tomohiro Kusumi wrote:
> Hi
>
> Now I'm at
> $NetBSD: if_wm.c,v 1.495 2017/03/03 07:57:49
> but still hits the same assertion located at L6622.

Ahh, sorry, the newest revision is not r1.495 but r1.496...
r1.495 does not contain the fix yet.

The ident of r1.496 is
===
+/* $NetBSD: if_wm.c,v 1.496 2017/03/03 16:48:55 knakahara Exp $*/
===
and the commit log is
===
fix r1.492 bug, sorry
===

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: if_wm panics on boot
Hi,

On 2017/03/04 0:29, Tomohiro Kusumi wrote:
> Yes, the source says 1.492 which is very recent.
> I guess I'll just revert to the one that was (safely)working for now.
>
> 1.492 2017/03/03 03:33:44 knakahara

Sorry, it is my mistake, as you pointed out.
I fixed it in r1.495 just now. Could you try the newest wm?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: -current 7.99.36 multicast panic: trap
Hi,

On 2016/09/05 1:00, Frank Kardel wrote:
> running the -current (7.99.36) with 7.99.16 userland reliably traps at:
>
> src/sys/netinet/ip_mroute.c:1751
>
> on an multi-homed i386 system running mrouted.
>
> A 7.99.16 kernel survives that fine. Do we have a userland dependency or
> is this a new regression from the network stack multiprocessing changes?

The ip_mroute.c changes between 7.99.16 and 7.99.36 do not introduce any
userland dependency, so I think it may be a new regression.
Could you provide the following additional information?
    - backtrace at the panic
    - your kernel config
      # The GENERIC config seems to disable MROUTING by default, so I
      # think you are using a custom kernel config.
    - detailed steps to reproduce
    - "netstat -g" and "netstat -nr -f inet" output taken before the
      panic, if possible

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: kernel panic
Hi,

On 2016/06/16 8:15, bch wrote:
> I am now at 1.414, and it seems stable.

Thank you for checking and reporting.
If there still seem to be problems, please tell us.

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: kernel panic
Hi,

On 2016/06/16 1:44, bch wrote:
> On 6/12/16, bch wrote:
>> On 6/11/16, bch wrote:
snip
> And now, on wm(4):
> -rwxr-xr-x 1 root wheel 18218304 Jun 14 10:20 /netbsd
>
> strathcona# crash -M ./netbsd.8.core /netbsd
> Crash version 7.99.30, image version /amd64/compile/G.
> WARNING: versions differ, you may not be able to examine this image.
> System panicked: trap
> Backtrace from time of crash is available.
> crash> bt
> _KERNEL_OPT_NARCNET() at 0
> _KERNEL_OPT_ACPI_SCANPCI() at _KERNEL_OPT_ACPI_SCANPCI+0x5
> aprint_verbose() at aprint_verbose+0x2f
> aprint_naive_internal.part.0() at aprint_naive_internal.part.0+0x14
> trap() at trap+0xc4b
> --- trap (number 6) ---
> mutex_enter() at mutex_enter+0xc
> fddi_output() at fddi_output+0x47c
> wm_tick() at wm_tick+0x230
> in6_update_ifa1() at in6_update_ifa1+0x766
> in6ifa_ifpforlinklocal() at in6ifa_ifpforlinklocal+0x4a
> in6_control1() at in6_control1+0x521
> in6_control() at in6_control+0x10d
> udp6_connect_wrapper() at udp6_connect_wrapper+0x83
> compat_43_sa_put() at compat_43_sa_put+0x14
> if_flags_set() at if_flags_set+0xb5
> sysctl_kern_sysvipc() at sysctl_kern_sysvipc+0x37d
> handle_modctl_load() at handle_modctl_load+0x108
> syscall() at syscall+0x14b
> --- syscall (number 54) ---
> 7f7ff74e89fa:
> crash>

Is your if_wm.c ident perhaps r1.413?
If so, could you try r1.414?

Thanks,

--
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA
Re: systems with -current wm(4) hang?
Hi, On 2016/06/08 16:58, John D. Baker wrote: > On Tue, 7 Jun 2016, Kengo NAKAHARA wrote: > >> Thank you for rechecking. I'm glad the revision fixes almost >> system problem. > >>> Dell PowerEdge SC430, PCI/PCIe busses: >>> >>> PCI-Express add-in card: >>> wm0 at pci2 dev 0 function 0: Intel PRO/1000 PT (82571EB) (rev. 0x06) >>> wm1 at pci2 dev 0 function 1: Intel PRO/1000 PT (82571EB) (rev. 0x06) >>> >>> Interface "wm1" is not bootable and therefore was not tested. >> >> Hmm, I have no idea to fix this problem... > > It's not a problem with the driver. The card has two MACs in a single > package and the PXE bootrom on the card allows booting from the first > interface only. As the test system was running (semi-)diskless, I could > only test using wm0. Since I had no problem with wm0, I would expect > not to have problems with wm1 either. I see. Thank you for telling me. > I arranged a kernel config to force root on wm1 ("boot -a" can't be used > as the USB keyboard doesn't work at the "root device:" prompt). I boot > over wm0 or the machine's built-in bge0 and switch the cable to wm1 > while the kernel is initializing. > > Stress testing building packages has proceeded without incident. Thank you for rechecking. I am relieved to hear it. Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: systems with -current wm(4) hang?
Hi, I'm sorry that I forgot to let you know about the fix. On 2016/06/07 3:33, John D. Baker wrote: > Now using if_wm.c r1.411. Thank you for rechecking. I'm glad the revision fixes the problem on almost all systems. > The amd64-class systems I've tested so far: > snip > Dell PowerEdge SC430, PCI/PCIe busses: > > PCI-Express add-in card: > wm0 at pci2 dev 0 function 0: Intel PRO/1000 PT (82571EB) (rev. 0x06) > wm1 at pci2 dev 0 function 1: Intel PRO/1000 PT (82571EB) (rev. 0x06) > > Interface "wm0" did not exhibit problems before. Rechecking to see if > there were any regressions. Stress testing building packages proceeded > without incident. > > Interface "wm1" is not bootable and therefore was not tested. Hmm, I have no idea how to fix this problem... Thank you very much for your detailed survey! Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: if_wm.c 1.410 sometimes hangs / sndq drops
Hi, On 2016/05/30 0:16, Michael van Elst wrote: > kar...@netbsd.org (Frank Kardel) writes: > >> With -current as of 20160526T13Z and if_wm.c 1.410 >> a stuck interface is observed on following hardware > >> wm1 at pci11 dev 0 function 0: Intel i82583V (rev. 0x00) >> wm1: interrupting at ioapic0 pin 18 >> wm1: PCI-Express bus >> wm1: 2048 words FLASH, version 1.10.0, Image Unique ID >> wm1: Ethernet address bc:5f:f4:98:32:84 >> makphy0 at wm1 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1 > >> when rsyncing. The interface is on the sending side and >> the sndq drops increase dramatically: > >> net.interfaces.wm1.rcvq.drops = 0 >> net.interfaces.wm1.sndq.len = 0 >> net.interfaces.wm1.sndq.maxlen = 256 >> net.interfaces.wm1.sndq.drops = 6077 > >> ifconfig down/up recover the interface. > > > Same here. I'm not sure if that is specific to wm but: > > if_wm.c 1.400 seems to have the problem. > if_wm.c 1.391 does not. > > N.B. the current version 1.410 also has some other issue as it > causes dhcpcd to behave differently. Probably something > with carrier detection. This is something not visible already > with 1.400. I think my mistake in if_wm.c:r1.409 causes this problem, and I think it is fixed in if_wm.c:r1.411. Could you try that revision? I'm sorry that I don't have any Ethernet controllers earlier than 82575, such as 82583. I will borrow them from msaitoh@n.o to test... Thanks, -- ////// Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: systems with -current wm(4) hang?
Hi, Thank you very much for the very detailed and lucid survey! On 2016/05/27 0:23, John D. Baker wrote: > Using "if_wm.c" r1.407, > > A ThinkPad T42 with: > > wm0 at pci2 dev 1 function 0: Intel i82540EP 1000BASE-T Ethernet (rev. 0x03) snip > > became unresponsive shortly after netbooting -current. Following > reboot, it hung again while performing 'etcupdate' processing. > > > An IBM eServer x306, i386 with PCI/PCI-X busses has: > > wm0 at pci1 dev 1 function 0: Intel i82547GI 1000BASE-T Ethernet (rev. 0x00) snip > wm1 at pci3 dev 3 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x00) snip > > and booting from "wm0" hangs during multiuser boot while building the > "dev" database. > > Booting from "wm1" completes multiuser boot and I was able to complete > 'etcupdate' and 'postinstall' operations. I built several packages > (${WRKOBJDIR} on local disk), but it eventually hung too. > > > An amd64-class machine (Dell Optiplex 760) with: > > wm0 at pci0 dev 25 function 0: 82567LM-3 LAN Controller (rev. 0x02) snip > > Booting an installation built from sources around 201605182230Z, all > seemed well. Following an update to a system built from sources around > 201605201620Z, the machine hung in "/etc/rc.d/fccache" during its first > boot. Although the terminal driver responded, one could not drop into > DDB via the USB keyboard (system is USB-only until I can get the PS/2 > serial adapter cable). ACPI powerdown via the power switch hung as well, > so a forced power-off by holding the power button was required. > > On the next boot, it completed startup but hung again during 'xdm' > initialization requiring another forced power-cycle. Since disabling > 'xdm', the machine has now booted multiuser. Further stress testing > consisted of recursive 'pkg_delete' of old (GCC 4.8.5-built) packages > in preparation for rebuilding with GCC 5.3.0. It eventually hung. > > Using the same machine to investigate: > > wm0 at pci1 dev 0 function 0: Intel i82574L (rev.
0x00) snip > > booted without problems. Stress testing building packages did not > provoke a hang during the time I ran it (around 12 hours). > > Again using the same machine to investigate: > > wm1 at pci4 dev 0 function 0: Intel i82541PI 1000BASE-T Ethernet (rev. 0x05) snip > booted without problems. Stress testing building packages hung rather > soon after. > > > A Dell PowerEdge 750 (i386) with PCI/PCI-X busses and: > > wm0 at pci1 dev 1 function 0: Intel i82547GI 1000BASE-T Ethernet (rev. 0x00) snip > > wm1 at pci3 dev 2 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x00) snip > > booting from "wm0" hung during rebuilding "dev" database on first boot. > As the machine has PS/2 keyboard attached, "Ctrl-Alt-ESC" allowed dropping > to the debugger to reboot. Subsequent boots with this interface hung in > the same place. > > Booting from "wm1" completed startup and 'etcupdate'/'postinstall' > operations and completed its subsequent boot. Further stress testing > building packages ran for a long time, but halted when the machine > spontaneously shut down and would not remain powered up. > > > A Dell PowerEdge 2850 (amd64-class, PCI/PCI-X busses) with: > > wm0 at pci6 dev 7 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x05) snip > > wm1 at pci7 dev 8 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x05) snip > > booting from "wm0" hung rebuilding the "dev" database during startup. > > Booting from "wm1" passed the database rebuild but hung while updating > fontconfig cache. Subsequent reboot with "wm1" succeeded. Further > stress testing building packages eventually hung. > > > A Dell PowerEdge SC430 (amd64-class, PCI/PCIe busses) with: > > wm0 at pci2 dev 0 function 0: Intel PRO/1000 PT (82571EB) (rev. 0x06) snip > wm1 at pci2 dev 0 function 1: Intel PRO/1000 PT (82571EB) (rev. 0x06) snip > > booted from "wm0" without problems. Stress testing building packages did > not hang during my test period (about 12 hours). 
> > The "wm1" interface on this card is not bootable. These Ethernet controllers all match the condition under which I think the hang can occur. However, hmm, there are unexpectedly many hang patterns. There may be another bug... > Now I will update sources (if_wm.c r1.409+) rebuild and see if the > problem was fixed. I expect if_wm.c r1.409 to resolve the problem for all of the above NICs. Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: systems with -current wm(4) hang?
Hi John D. Baker, On 2016/05/23 2:09, John D. Baker wrote: > On Sun, 22 May 2016, Robert Swindells wrote: > >> I think we need to know exactly which make and model of CPU are working >> or not working. > > The following PCI (not PCIe) add-on card hangs machine under -current, > works under -7 on both (so far) problem machines: > > wm0 at pci2 dev 2 function 0: Intel i82541PI 1000BASE-T Ethernet (rev. 0x05) > wm0: interrupting at ioapic0 pin 18 > wm0: 32-bit 33MHz PCI bus > wm0: 64 words (6 address bits) MicroWire EEPROM > wm0: Ethernet address 00:1b:21:9b:14:0c > igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0 > igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > 1000baseT-FDX, auto Sorry, this is probably my fault. I introduced a bug in the code that supports controllers earlier than 82575 (including 82541) in if_wm.c:r1.401. I fixed this possible hang bug in if_wm.c:r1.409. Could you try that revision? Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
bridge(4) and wm(4) with NET_MPSAFE is MP-scalable for now
@n.o's bridge(4) MP-scalable work, riastradh@n.o's psref(9) and pslist(9) work, and joerg@n.o's advice on the if_transmit interface. Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: PCI MSI for re(4)
Hi, On 2015/11/13 13:37, Jonathan A. Kollasch wrote: > Attached is a patch that should enable PCI MSI for pci(4)-attached re(4) > NICs. I would like both review of the code, and additional testing. > > I've tested it successfully on amd64 -current with a > 8100E/8101E/8102E/8102EL chip. Great, the patch looks good to me. I'm sorry that I cannot do additional tests as I don't have re(4) NICs. Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, Core Product Development Department, Product Division, Technology Unit Kengo NAKAHARA
Re: msi & gcc build problem
Hi, On 2015/04/28 15:29, Martin Husemann wrote: > On Tue, Apr 28, 2015 at 07:02:39AM +0100, Patrick Welche wrote: >> I have been building with NOGCCERROR=yes in mk.conf - thoughts on a more >> elegant solution to quell the warning? Sorry, I missed the build check... > You are probably building without options DIAGNOSTIC, so the KASSERT is > going away. > > I just fixed that file. The code looks good to me. Thank you very much for the fix! Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, Core Product Development Department, Product Division, Technology Unit Kengo NAKAHARA
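For readers hitting the same warning, the situation Martin describes can be sketched in userland C. This is only an illustration under stated assumptions, not the actual fix that was committed: X_KASSERT and X_DIAGUSED are hypothetical stand-ins for the kernel's KASSERT and __diagused, and struct x_device_info and x_attach_check() are invented for the example. When DIAGNOSTIC is not defined the assertion expands to nothing, so a variable that is referenced only inside it would otherwise be reported as unused.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userland stand-ins for the kernel macros (assumptions made
 * for illustration; the real definitions live in the NetBSD headers).
 */
#ifdef DIAGNOSTIC
#define X_KASSERT(e)	assert(e)	/* assertion is compiled in */
#define X_DIAGUSED			/* variable really is used */
#else
#define X_KASSERT(e)	((void)0)	/* expression disappears entirely */
#define X_DIAGUSED	__attribute__((__unused__))
#endif

/* Invented driver data type, only for this sketch. */
struct x_device_info {
	int gen;
};

/*
 * Returns 0 on success.  `info' is referenced only inside X_KASSERT, so
 * without X_DIAGUSED a non-DIAGNOSTIC build would warn that it is unused.
 */
int
x_attach_check(const struct x_device_info *driver_data)
{
	const struct x_device_info *const info X_DIAGUSED = driver_data;

	X_KASSERT(info != NULL);
	return 0;
}
```

Compiling this with and without -DDIAGNOSTIC (and with -Wall -Werror) should succeed in both cases; removing X_DIAGUSED from the declaration should reproduce the unused-variable warning in the non-DIAGNOSTIC build.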
Re: point2point network interfaces cannot receive ipv6 packets
Hi, On 2015/04/03 16:14, Takahiro HAYASHI wrote: > It seems that IFF_POINTTOPOINT interfaces like tun and gif cannot > receive ipv6 packets. > This occurs on NetBSD/amd64 -current since Feb 27 2015. > > For example, establishing gif tunnel between 2 hosts. > > [host1] <---> [host2] > 192.168.0.1 192.168.0.2 ipv4 address of real interface > fd00::1 fd00::2 gif address > > When I ping6, a host can send ICMPv6 ECHO(128), but the other host > returns ICMPv6 DST_UNREACH(1) code UNREACH_ADDR(3) to pinging host. I think the cause of this issue is the commit below: http://www.nerv.org/netbsd/?q=id:20150226T095446Z.75354d997222ae09acc944ba1c6cf573c3ea724b This commit changes the route entry for gif as described below == before == Internet6: Destination Gateway Flags Refs Use Mtu Interface fd00::2 link#13 UHL 0 0 - lo0 == before == == after == Internet6: Destination Gateway Flags Refs Use Mtu Interface fd00::2 fd00::2 UH - - - gif0 == after == This route change altered the control flow in ip6_input(), in particular at the line below http://nxr.netbsd.org/xref/src/sys/netinet6/ip6_input.c#497 After the above commit, this condition becomes false, and the packets are then discarded at line 565. I found the cause above; however, I have no idea how to fix this issue... > roy@n.o Could you comment on this issue? Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, Core Product Development Department, Product Division, Technology Unit Kengo NAKAHARA
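The effect of the route change can be modelled with a small userland sketch. The exact condition at line 497 is not quoted in this message, so the check below is an assumption: it supposes that ip6_input() treats a packet as destined for this host only when the route to the destination is a host route whose interface is a loopback. Everything here (struct x_route, the X_RTF_* flag values, x_deliver_locally()) is hypothetical and is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define X_RTF_GATEWAY	0x2
#define X_RTF_HOST	0x4

/* Simplified route entry; not the kernel's struct rtentry. */
struct x_route {
	int		flags;
	const char	*ifname;	/* e.g. "lo0" or "gif0" */
};

/*
 * Assumed local-delivery test: accept only a host route (not via a
 * gateway) whose outgoing interface is a loopback interface.
 */
bool
x_deliver_locally(const struct x_route *rt)
{
	if (rt == NULL)
		return false;
	assert(rt->ifname != NULL);
	if ((rt->flags & (X_RTF_HOST | X_RTF_GATEWAY)) != X_RTF_HOST)
		return false;
	/* Loopback interfaces are recognised here by the "lo" name prefix. */
	return strncmp(rt->ifname, "lo", 2) == 0;
}
```

Under this assumption, the "before" entry (host route on lo0) passes the test, while the "after" entry (host route on gif0) does not, which would match the observed DST_UNREACH reply.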