Re: CURRENT: re(4) crashing system
On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote:
> I tried to report earlier here that CURRENT does have some serious
> problems right now and one of those problems seems to be triggered by
> the recent re(4) driver. The problem is also present in recent 11-STABLE!
>
> Below, you'll find pciconf output regarding the device on a Lenovo E540
> laptop I can test on and trigger the problem.
>
> The phenomenon is that this NIC does not negotiate 1000baseTX; it is
> always falling back to 100baseTX although the device claims to be a
> 1 GBit capable device.
>
> When I try to put the device manually into 1000baseTX mode via
>
> ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver)
>
> it is possible to crash the system. The system also crashes when
> plugging/unplugging the LAN cord - I guess the renegotiation is
> triggering this crash immediately.
>
> I tried with several switches and routers capable of 1 GBit and it
> seems to be independent of the network hardware in use.
>
> I tried to capture a backtrace when the kernel crashes, but I do not
> know how to save the kernel debugger output. Although I configured
> debugging according to the handbook, there is no coredump at all.
>
> Advice is appreciated - if anybody is interested in solving this.
>

There were several instability reports on re(4). My vague guess is
that they are related to missing initialization for certain
controllers. Unfortunately, there is no publicly available datasheet
for those controllers, and access to one is unlikely in the near
future. The vendor's FreeBSD driver seems to access lots of magic
registers as well as load DSP fixups. I have no idea what it is trying
to do, and re(4) has relied heavily on power-on default register
values. The engineering samples I have do not show instabilities, so
it won't be easy to identify the issue. Probably the first step would
be identifying the affected chips and narrowing down the scope of
guessing.

Would you show me the dmesg output (re(4) and rgephy(4) only)?
pciconf(8) output is useless here since RealTek uses the same PCI ID
for PCIe variants.

BTW, I was told that the vendor's FreeBSD driver seems to work fine
for normal usage patterns. In the past, the vendor's driver triggered
an instant panic and lacked H/W offloading features. That might have
changed, though.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
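
The reply above asks for only the boot messages from re(4) and its PHY
driver, rgephy(4). One way to extract them is sketched below. The
function name re_dmesg_lines is made up for illustration, and the
pattern assumes the usual "<driver><unit>:" prefix on kernel messages.

```sh
#!/bin/sh
# Keep only kernel messages emitted by re(4) and rgephy(4) units,
# i.e. lines starting with "re0:", "rgephy0:" and so on.
# re_dmesg_lines is a hypothetical helper name, not a FreeBSD tool.
re_dmesg_lines() {
    grep -E '^(re|rgephy)[0-9]+:'
}

# Typical use on the affected machine (not run here):
#   dmesg | re_dmesg_lines
#   re_dmesg_lines < /var/run/dmesg.boot
```

Reading /var/run/dmesg.boot instead of live dmesg output may help if the
boot-time attach messages have already scrolled out of the kernel
message buffer.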
Re: was: CURRENT [r307305]: r307823 still crashing
On Sun, 23 Oct 2016, O. Hartmann wrote:

> How can I track a memory leak?

I think I did not read enough of the context, but in general vmstat
and top can track memory usage.

> How can I write to disk the backtrace given by the debugger when
> crashing? My box I can freely test is using the nVidia BLOB and vt(), so
> I can not see the backtrace. I got a very bad screenshot on one of my
> laptops, but it's so ugly/unreadable, I think it is unusable to be
> presented within this list at a reasonable size (200 kB max is too
> small).

The backtrace should be part of the crash dump that is written to the
(directly connected, non-encrypted, non-USB) swap device. "call doadump"
at the debugger prompt (even typing blind) is supposed to make sure
a dump is taken.

With respect to the screenshot, you should be able to post the image on
an external site and send a link to the list, at least.

-Ben
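
For reference, the advice above can be turned into a rough checklist.
The sketch below assumes the stock rc.conf(5) dumpdev knob and the
default /var/crash dump directory. dump_status is a made-up helper that
only reports whether a given rc.conf-style file sets a dump device; it
does not consult /etc/defaults/rc.conf.

```sh
#!/bin/sh
# Rough checklist (assumes a swap device large enough to hold the dump):
#   1. /etc/rc.conf:   dumpdev="AUTO"    # use the swap device for dumps
#   2. at the db> prompt after a panic:  call doadump   then   reset
#   3. on the next boot savecore(8) copies the dump to /var/crash
#
# dump_status is a hypothetical helper: it reads an rc.conf-style file
# (plain sh assignments) in a subshell and reports the dumpdev value.
dump_status() {
    (
        dumpdev=
        [ -r "$1" ] && . "$1"
        case "$dumpdev" in
            ""|[Nn][Oo]) echo "no dump device set" ;;
            *)           echo "dump device: $dumpdev" ;;
        esac
    )
}

# Example (not run here): dump_status /etc/rc.conf
```

Sourcing the file in a subshell keeps its variables out of the calling
shell, which seems the least surprising choice for a quick check.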
was: CURRENT [r307305]: r307823 still crashing
On Sat, 15 Oct 2016 12:13:21 +0200, "O. Hartmann" wrote:

> On Sat, 15 Oct 2016 10:22:42 +0200,
> "O. Hartmann" wrote:
>
> > On Fri, 14 Oct 2016 10:48:33 +0200,
> > "O. Hartmann" wrote:
> >
> > > Systems I updated to recent CURRENT start crashing spontaneously.
> > >
> > > The most recently crashing system is on
> > > 12.0-CURRENT FreeBSD 12.0-CURRENT #11 r307305: Fri Oct 14 08:37:59 CEST 2016
> > >
> > > Another (no access since it is remote and not accessible until later
> > > in the day) was updated ~12 hours ago and it is also
> > > rebooting/crashing without any warnings. Can be triggered under
> > > heavy load.
> > >
> > > The only system at r307263 that is stable so far is an older
> > > two-socket XEON Core2Duo based machine; all crashing boxes have
> > > CPUs equal to or newer than IvyBridge.
> > >
> > > Does anyone else see these crashes? I tried to compile a debug
> > > kernel on one host, but that's the remote machine I have access to
> > > later; it failed compiling the kernel - under load it crashed
> > > often. After ZFS scrubbing kicked in, it vanished from the net ;-/
> > >
> > > kind regards,
> > > oh
> >
> > r307341 is still crashing unpredictably (FreeBSD 12.0-CURRENT #5 r307341:
> > Sat Oct 15 09:36:16 CEST 2016).
> >
> > I'm back to r307157, which seems to be "stable".
> >
> > Seems I'm the only one at the moment having those problems :-(
>
> I now have a laptop available and have started putting debugging options
> into the kernel. But the laptop, so far, doesn't exhibit the crashes
> described above. The laptop is the only system so far without ZFS!
>
> The most frequently crashing box is a CURRENT server with the largest
> ZFS volume. When on most recent CURRENT (>r307157, see above), starting
> a scrub on a RAIDZ volume of ~12 TB gross size AND running a poudriere
> job triggers the crash every 1 - 18 minutes. Another box with only
> /home as a ZFS volume on a dedicated HDD crashes after minutes or
> hours. A laptop, also CURRENT (now at r307349), without ZFS is stable
> as long as I do not pull the LAN wire (a problem I also described on
> the list; I am trying to capture the screen when crashing right now).

I have now spent the last three days trying to figure out whether my
custom config is faulty or CURRENT has a serious bug. Even with GENERIC
and in single user mode (it then takes longer), CURRENT, now at r307823,
is crashing.

The crashes seem to be unrelated to X11, but I can trigger the crash
faster when using firefox. I can also trigger it faster when doing an
"svn update" on a ZFS pool containing /usr/ports. Everyone who uses ZFS
on /usr/src or /usr/ports and updates via subversion knows that over
time the update process takes 10 - 15 minutes on ZFS volumes - compared
to several minutes on UFS. And while svn traverses /usr/ports, the
crash occurs.

I'm still surprised that nobody else is facing such periodic crashes.
As I already reported, the crash occurs with CURRENT on several boxes,
with or without ZFS.

How can I track a memory leak?

How can I write to disk the backtrace given by the debugger when
crashing? My box I can freely test is using the nVidia BLOB and vt(), so
I can not see the backtrace. I got a very bad screenshot on one of my
laptops, but it's so ugly/unreadable, I think it is unusable to be
presented within this list at a reasonable size (200 kB max is too
small).
CURRENT: re(4) crashing system
I tried to report earlier here that CURRENT does have some serious
problems right now and one of those problems seems to be triggered by
the recent re(4) driver. The problem is also present in recent 11-STABLE!

Below, you'll find pciconf output regarding the device on a Lenovo E540
laptop I can test on and trigger the problem.

The phenomenon is that this NIC does not negotiate 1000baseTX; it is
always falling back to 100baseTX although the device claims to be a
1 GBit capable device.

When I try to put the device manually into 1000baseTX mode via

    ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver)

it is possible to crash the system. The system also crashes when
plugging/unplugging the LAN cord - I guess the renegotiation is
triggering this crash immediately.

I tried with several switches and routers capable of 1 GBit and it
seems to be independent of the network hardware in use.

I tried to capture a backtrace when the kernel crashes, but I do not
know how to save the kernel debugger output. Although I configured
debugging according to the handbook, there is no coredump at all.

Advice is appreciated - if anybody is interested in solving this.

Thank you very much in advance and kind regards,

Oliver

[...]
re0@pci0:3:0:0: class=0x02 card=0x502817aa chip=0x816810ec rev=0x10 hdr=0x00
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
    bar   [10] = type I/O Port, range 32, base 0x3000, size 256, enabled
    bar   [18] = type Memory, range 64, base 0xf0d04000, size 4096, enabled
    bar   [20] = type Memory, range 64, base 0xf0d0, size 16384, enabled
    cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit
    cap 10[70] = PCI-Express 2 endpoint MSI 1 max data 128(128) RO
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 11[b0] = MSI-X supports 4 messages, enabled
                 Table in map 0x20[0x0], PBA in map 0x20[0x800]
    cap 03[d0] = VPD
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 0 corrected
    ecap 0002[140] = VC 1 max VC0
    ecap 0003[160] = Serial 1 0100684ce000
    ecap 0018[170] = LTR 1
    ecap 001e[178] = unknown 1
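
Since the report describes the NIC falling back to 100baseTX, one quick
sanity check is whether the driver lists any gigabit media at all.
"ifconfig -m" prints the supported media for an interface; the sketch
below scans that output. has_gigabit_media is a made-up helper name, and
the pattern assumes the indented "media <type> ..." lines that ifconfig
prints under "supported media".

```sh
#!/bin/sh
# Report whether "ifconfig -m" style output on stdin lists any
# 1000base* entry among the supported media lines.
# has_gigabit_media is a hypothetical helper, not a FreeBSD tool.
has_gigabit_media() {
    if grep -Eq '^[[:space:]]+media 1000base'; then
        echo "gigabit media listed"
    else
        echo "no gigabit media listed"
    fi
}

# Typical use on the affected machine (not run here):
#   ifconfig -m re0 | has_gigabit_media
```

The pattern requires a space after "media", so the "media:" line that
shows the currently negotiated mode is ignored on purpose.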
Re: installworld fails on missing tzsetup when WITHOUT_DIALOG is set
On Sun, Oct 23, 2016 at 11:41:23AM +0300, Guy Yur wrote:
> On Sat, Oct 22, 2016 at 7:23 PM, Baptiste Daroussin wrote:
> > On Sat, Oct 22, 2016 at 06:51:28PM +0300, Guy Yur wrote:
> >> Hi,
> >> ...
> >
> > My proposal is a bit different: build tzsetup without dialog support :)
> >
> > https://reviews.freebsd.org/D8325
> >
> > Best regards,
> > Bapt
>
> Thanks.

FYI it is in

Best regards,
Bapt
Re: installworld fails on missing tzsetup when WITHOUT_DIALOG is set
On Sat, Oct 22, 2016 at 7:23 PM, Baptiste Daroussin wrote:
> On Sat, Oct 22, 2016 at 06:51:28PM +0300, Guy Yur wrote:
>> Hi,
>> ...
>
> My proposal is a bit different: build tzsetup without dialog support :)
>
> https://reviews.freebsd.org/D8325
>
> Best regards,
> Bapt

Thanks.

Guy