Re: bsnmpd always died on HDD detach
On Wed, Sep 12, 2012 at 10:39:12AM +0200, Miroslav Lachman wrote:

> (gdb) bt
> #0 0x000801046cba in disk_query_disk (entry=0x0) at hostres_diskstorage_tbl.c:241
> #1 0x000801dd6a00 in ?? ()
> #2 0x000801dd6600 in ?? ()
> #3 0x in ?? ()
> #4 0x000801048230 in device_entry_create (name=0x0, location=0x800c14ee0 0, descr=0x8010482a6 ) at hostres_device_tbl.c:217
> #5 0x000801dd7800 in ?? ()
> #6 0x000801dd7800 in ?? ()
> #7 0x000801dd7400 in ?? ()
> #8 0x in ?? ()
> #9 0x000801048230 in device_entry_create (name=0x801dd7c00 , location=0x801048230 ˙˙I\213|$8čŕ\201˙˙L\211çčŘ\201˙˙é\035ţ˙˙H\215\025, descr=0x8010482a6 ) at hostres_device_tbl.c:217
> #10 0x000801dd4a00 in ?? ()
> #11 0x000801dd4a00 in ?? ()
> #12 0x000801dd1a00 in ?? ()
> #13 0x in ?? ()
> #14 0x000801048230 in device_entry_create (name=0x801dd8400 , location=0x801048230 ˙˙I\213|$8čŕ\201˙˙L\211çčŘ\201˙˙é\035ţ˙˙H\215\025, descr=0x8010482a6 ) at hostres_device_tbl.c:217
> #15 0x000801dd1800 in ?? ()
> #16 0x000801dd1800 in ?? ()
> #17 0x000800c00ea8 in ?? ()
> #18 0x0051b1c8 in ?? ()
> #19 0x000800c00938 in ?? ()
> #20 0x0051b258 in ?? ()
> #21 0x000801dc8a00 in ?? ()
> #22 0x0008009f7be9 in free () from /lib/libc.so.7
> #23 0x in ?? ()
> #24 0x7fffed98 in ?? ()
> #25 0x0008010478bd in device_entry_delete () at hostres_device_tbl.c:266
> #26 0x005187d0 in snmp_error ()
> #27 0x000801047be6 in op_hrDeviceTable (ctx=Variable ctx is not available. ) at hostres_device_tbl.c:671
> #28 0x0051b840 in ?? ()
> #29 0x0051b830 in ?? ()
> #30 0x in ?? ()
> #31 0x7fffc360 in ?? ()
> #32 0x0051b830 in ?? ()
> #33 0x in ?? ()
> #34 0x0008009efbd2 in _pthread_mutex_init_calloc_cb () from /lib/libc.so.7
> #35 0x0008009f2d32 in _malloc_prefork () from /lib/libc.so.7
> #36 0x0008009f6e1f in realloc () from /lib/libc.so.7
> #37 0x000800e0b441 in mib_if_is_dyn () from /usr/lib/snmp_mibII.so
> #38 0x in ?? ()
> #39 0x7fffc5cc in ?? ()
> #40 0x0001 in ?? ()
> #41 0x7fffc5e0 in ?? ()
> #42 0x31fa39e2fac72819 in ?? ()
> #43 0x0001 in ?? ()
> #44 0x00080065fad5 in poll_dispatch () from /lib/libbegemot.so.4
> #45 0x0040616a in main ()
>
> I hope it helps you to debug this problem.

Looks like we can't trust this output.

-- 
Mikolaj Golub
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Thinkpad X61s cannot boot 9.1-BETA1
On Wed, Sep 12, 2012 at 11:08:25PM +0300, Alexander Motin wrote:
> On 12.09.2012 22:58, Lars Engels wrote:
>> On Wed, Sep 12, 2012 at 09:58:31PM +0300, Alexander Motin wrote:
>>> On 12.09.2012 20:46, Lars Engels wrote:
>>>> On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:
>>>>> on 12/09/2012 20:25 Lars Engels said the following:
>>>>>> On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:
>>>>>>> Could you try to play with different eventtimer settings
>>>>>>> (preferably in current)? You can use this thread / PR as a guide:
>>>>>>> http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495
>>>>>>> The place where boot stops looks suspiciously close to the place
>>>>>>> where timer interrupts should start driving the system.
>>>>>>
>>>>>> Yes, that's it! Setting kern.eventtimer.timer=i8254 lets the
>>>>>> Thinkpad boot on CURRENT with the AC cable inserted.
>>>>>
>>>>> Please share your sysctl kern.eventtimer output with Alexander.
>>>>> He will probably ask for some additional information :-)
>>>
>>> Sorry if I've missed it, but it would be useful to see verbose dmesg
>>> in the situation where the system couldn't boot without switching
>>> eventtimer.
>>
>> No problem. See: http://bsd-geek.de/FreeBSD/IMAG0190.jpg
>
> No, I've seen that one and I don't mean it. I mean full verbose dmesg
> of a successful boot in conditions where the system was not booting
> before without setting kern.eventtimer.timer=i8254.

Ok, sorry. Here's a verbose dmesg booting CURRENT without AC power:
http://bsd-geek.de/FreeBSD/T61_dmesg.boot.works
Re: GEOM_RAID in GENERIC is harmful
On 13.09.2012 08:31, Eugene Grosbein wrote:
> 9-STABLE has got options GEOM_RAID in GENERIC. In real world, this
> change is pretty harmful and there are lots of cases when 9.0-RELEASE
> systems upgraded to 9-STABLE fail to mount root UFS filesystem or
> attach ZFS.
>
> It seems, there are lots of HDDs supplied with pseudo-RAID labels at
> the end: pre-installed Windows machines having motherboards with
> pseudo-RAID like Intel RapidStore and alike. One can not even be aware
> of these labels.
>
> 9.0-RELEASE can be installed on such HDDs and use them with GMIRROR or
> ZFS without a problem. Upgraded to 9-STABLE, such system fails to build
> due to GRAID jumping out of box and grabbing HDDs for itself, so
> GMIRROR or ZFS got broken.
>
> That makes users very angry when production server fails to boot with
> GENERIC kernel after correctly performed upgrade.
>
> GEOM_RAID compiled in GENERIC should be deactivated and require
> activation with some loader knob. Also, we need distinct RELEASE NOTES
> warning about the issue.

Problem of on-disk metadata garbage is not limited to GEOM_RAID. For
example, I had a case where remainders of an old UFS file system were
found by GEOM_LABEL and ZFS incorrectly attached to it instead of the
proper GPT partition, making other partitions inaccessible. Does it mean
we should remove GEOM_LABEL also? I don't think so. All that GEOM_RAID
is guilty of is that it was not in place for the 9.0 release. If we
remove it now, it will just postpone the problem for a later time, or we
will never be able to add it again because of the same reasons.

Unlike GEOM_LABEL, metadata of GEOM_RAID is quite easy to delete without
complete disk erase: `graid status -ag`, `graid delete ...`. Yes, it can
be a problem if the system can't boot, but now we at least have live
mode on installation images, that should allow to do it.

Adding some loader tunables indeed could simplify recovery in case of
boot problem. I will probably add such ones now. It won't hurt. But I
disagree they should be disabled by default, limiting users who really
want to use BIOS RAID. Disabling them will also make metadata removal
without full wipe more difficult because different RAIDs have different
on-disk metadata layout, and you should know where exactly to apply dd.

-- 
Alexander Motin
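The cleanup sequence mentioned above can be sketched as a dry run that only prints the commands, so the plan can be reviewed before anything touches a disk. This is a minimal sketch, not a tested recovery procedure: the helper name `graid_cleanup_plan` and the array name `Intel-deadbeef` are hypothetical placeholders, and the real array name would have to be taken from the `graid status -ag` output on the affected machine (for example from the live mode of an installation image).

```shell
#!/bin/sh
# Dry-run sketch: print the graid commands for inspecting and then deleting
# stale BIOS-RAID arrays. Nothing is executed here.
# Array names are supplied by the caller; the one in the usage note below
# is a made-up placeholder.
graid_cleanup_plan() {
    # Inspect first, as suggested in the thread.
    echo "graid status -ag"
    # One delete command per array name given on the command line.
    for array in "$@"; do
        echo "graid delete $array"
    done
}

# Usage (hypothetical array name):
#   graid_cleanup_plan Intel-deadbeef
```

Printing instead of executing is a deliberate choice here, since deleting the wrong array destroys metadata that someone may actually rely on.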
Re: GEOM_RAID in GENERIC is harmful
13.09.2012 16:51, Alexander Motin wrote:
>> That makes users very angry when production server fails to boot with
>> GENERIC kernel after correctly performed upgrade.
>>
>> GEOM_RAID compiled in GENERIC should be deactivated and require
>> activation with some loader knob. Also, we need distinct RELEASE NOTES
>> warning about the issue.
>
> Problem of on-disk metadata garbage is not limited to GEOM_RAID. For
> example, I had a case where remainders of an old UFS file system were
> found by GEOM_LABEL and ZFS incorrectly attached to it instead of the
> proper GPT partition, making other partitions inaccessible. Does it
> mean we should remove GEOM_LABEL also? I don't think so. All that
> GEOM_RAID is guilty of is that it was not in place for the 9.0 release.
> If we remove it now, it will just postpone the problem for a later
> time, or we will never be able to add it again because of the same
> reasons.

We must be ready for lots of angry users of 9.1-RELEASE then and have
BIG RED WARNING in RELEASE NOTES.

Eugene Grosbein
Re: GEOM_RAID in GENERIC is harmful
Hi.

On 13.09.2012 15:51, Alexander Motin wrote:
> Problem of on-disk metadata garbage is not limited to GEOM_RAID. For
> example, I had a case where remainders of an old UFS file system were
> found by GEOM_LABEL and ZFS incorrectly attached to it instead of the
> proper GPT partition, making other partitions inaccessible. Does it
> mean we should remove GEOM_LABEL also? I don't think so. All that
> GEOM_RAID is guilty of is that it was not in place for the 9.0 release.
> If we remove it now, it will just postpone the problem for a later
> time, or we will never be able to add it again because of the same
> reasons.
>
> Unlike GEOM_LABEL, metadata of GEOM_RAID is quite easy to delete
> without complete disk erase: `graid status -ag`, `graid delete ...`.
> Yes, it can be a problem if the system can't boot, but now we at least
> have live mode on installation images, that should allow to do it.
>
> Adding some loader tunables indeed could simplify recovery in case of
> boot problem. I will probably add such ones now. It won't hurt. But I
> disagree they should be disabled by default, limiting users who really
> want to use BIOS RAID. Disabling them will also make metadata removal
> without full wipe more difficult because different RAIDs have
> different on-disk metadata layout, and you should know where exactly
> to apply dd.

From my point of view, the policy for new features should be like this:
new features introduced to the system should by default try to mimic the
old behavior. Right now we will have a situation where most of the users
will just upgrade to the new kernel, and will get a non-bootable system
or a system with one 100% busy disk (for example, a degraded raid0 gives
this). On a system that manages to boot up, 'graid delete -f' could lead
to a livelock (got it today, on a degraded raid1).

Furthermore, the situation where the engineer forgot about a disk with
glabel/gmirror data is less probable than a situation where you have a
'new' disk from another department which was extracted from some Windows
server or workstation. Should I test all of the disks against graid
labels? Yeah, maybe. But for the last X years I didn't do that, just
because it worked for me and it didn't lead to a crash. The softraid
labels were harmless all the way. I could use a zpool or a gmirror
without even knowing that I have them. Now I suddenly need to care about
the labels.

Is GEOM_RAID great, as a feature? Yep, it is. Is the way it is
introduced into the system that great? Not at all. From my point of view
GEOM_RAID in GENERIC kernel is a bomb, and we will lose lots of FreeBSD
beginners due to this.

Eugene.
Re: GEOM_RAID in GENERIC is harmful
On 13.09.2012 13:01, Eugene Grosbein wrote:
> 13.09.2012 16:51, Alexander Motin wrote:
>>> That makes users very angry when production server fails to boot
>>> with GENERIC kernel after correctly performed upgrade.
>>>
>>> GEOM_RAID compiled in GENERIC should be deactivated and require
>>> activation with some loader knob. Also, we need distinct RELEASE
>>> NOTES warning about the issue.
>>
>> Problem of on-disk metadata garbage is not limited to GEOM_RAID. For
>> example, I had a case where remainders of an old UFS file system were
>> found by GEOM_LABEL and ZFS incorrectly attached to it instead of the
>> proper GPT partition, making other partitions inaccessible. Does it
>> mean we should remove GEOM_LABEL also? I don't think so. All that
>> GEOM_RAID is guilty of is that it was not in place for the 9.0
>> release. If we remove it now, it will just postpone the problem for a
>> later time, or we will never be able to add it again because of the
>> same reasons.
>
> We must be ready for lots of angry users of 9.1-RELEASE then and have
> BIG RED WARNING in RELEASE NOTES.

Warning is good, but I don't think it will be lots. It has been enabled
in 9-STABLE for some time now and I haven't seen many complaints. If re@
permits the MFC of r240465 in a few days, the solution for those who may
need it will be simple: kern.geom.raid.enable=0.

-- 
Alexander Motin
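Once a kernel with r240465 is installed, the workaround named above is a single loader tunable. A minimal sketch of adding it to a loader.conf-style file without duplicating it on repeated runs follows. The helper name `set_loader_tunable` is hypothetical, and `/boot/loader.conf` in the usage note is the conventional location rather than something stated in the thread.

```shell
#!/bin/sh
# Append name="value" to a loader.conf-style file unless the file already
# mentions that tunable. The helper name is made up for this sketch.
set_loader_tunable() {
    file=$1
    name=$2
    value=$3
    # -F matches the tunable name literally (it contains dots).
    # An existing entry, whatever its value, is left untouched.
    if [ -f "$file" ] && grep -qF "${name}=" "$file"; then
        return 0
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$file"
}

# Usage, assuming the tunable from r240465 is present in the running kernel:
#   set_loader_tunable /boot/loader.conf kern.geom.raid.enable 0
```

The helper never rewrites an existing entry, so a value someone set deliberately is not silently overridden.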
Re: GEOM_RAID in GENERIC is harmful
On Thursday, September 13, 2012 6:13:51 am Eugene M. Zheganin wrote:
> From my point of view GEOM_RAID in GENERIC kernel is a bomb, and we
> will lose lots of FreeBSD beginners due to this.

I had the completely opposite experience. I bought a new desktop and
wanted to use the onboard SATA RAID. 9.0 didn't work out-of-the-box with
a RAID-1 volume configured using the BIOS. I knew to kldload
geom_raid.ko, but not all new users know to do that. I think the onboard
SATA RAID on typical x86 motherboards is something we should be
supporting out of the box. I don't disagree that there were some
surprising side effects from enabling GEOM_RAID, but I think your
viewpoint is very much one-sided.

-- 
John Baldwin
Re: Thinkpad X61s cannot boot 9.1-BETA1
On 13.09.2012 10:44, Lars Engels wrote:
> On Wed, Sep 12, 2012 at 11:08:25PM +0300, Alexander Motin wrote:
>> On 12.09.2012 22:58, Lars Engels wrote:
>>> On Wed, Sep 12, 2012 at 09:58:31PM +0300, Alexander Motin wrote:
>>>> On 12.09.2012 20:46, Lars Engels wrote:
>>>>> On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:
>>>>>> on 12/09/2012 20:25 Lars Engels said the following:
>>>>>>> On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:
>>>>>>>> Could you try to play with different eventtimer settings
>>>>>>>> (preferably in current)? You can use this thread / PR as a
>>>>>>>> guide:
>>>>>>>> http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495
>>>>>>>> The place where boot stops looks suspiciously close to the
>>>>>>>> place where timer interrupts should start driving the system.
>>>>>>>
>>>>>>> Yes, that's it! Setting kern.eventtimer.timer=i8254 lets the
>>>>>>> Thinkpad boot on CURRENT with the AC cable inserted.
>>>>>>
>>>>>> Please share your sysctl kern.eventtimer output with Alexander.
>>>>>> He will probably ask for some additional information :-)
>>>>
>>>> Sorry if I've missed it, but it would be useful to see verbose
>>>> dmesg in the situation where the system couldn't boot without
>>>> switching eventtimer.
>>>
>>> No problem. See: http://bsd-geek.de/FreeBSD/IMAG0190.jpg
>>
>> No, I've seen that one and I don't mean it. I mean full verbose dmesg
>> of a successful boot in conditions where the system was not booting
>> before without setting kern.eventtimer.timer=i8254.
>
> Ok, sorry. Here's a verbose dmesg booting CURRENT without AC power:
> http://bsd-geek.de/FreeBSD/T61_dmesg.boot.works

Hmm. I see nothing suspicious. HPET driver output is typical for the
ICH8M chipset, many of which are working fine in different systems,
including several of mine. There were no significant changes in HPET
after 9.0-RELEASE except r231161. It changed the device probe order,
which increased the chance of interrupt sharing. It should not be a
problem, but who knows. You can try to hint the HPET driver a specific
IRQ, 23 (that looks unused), to avoid sharing by setting
hint.hpet.0.allowed_irqs=0x00800000.

You've said that the problem is related to AC power state. Have you
compared dmesg outputs with and without it?

-- 
Alexander Motin
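The mask for a single IRQ can be worked out mechanically, assuming (as hpet(4) describes) that allowed_irqs is a 32-bit mask in which bit N permits IRQ N. A minimal sketch of that arithmetic; the helper name `irq_mask` is hypothetical.

```shell
#!/bin/sh
# Print the 32-bit allowed_irqs mask that permits exactly one IRQ,
# assuming bit N of the mask corresponds to IRQ N.
irq_mask() {
    # Shift a single set bit into position N and print it as 8 hex digits.
    printf '0x%08x\n' "$((1 << $1))"
}

# Usage:
#   irq_mask 23
```

Under that assumption, IRQ 23 corresponds to 0x00800000, and the resulting line would go into /boot/device.hints or /boot/loader.conf as hint.hpet.0.allowed_irqs.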
Re: Userland dtrace broken?
On Wed, Aug 29, 2012 at 14:01:15 +0100, Matt Burke wrote:
> Following http://wiki.freebsd.org/DTrace/userland on 9.1-RC1, the
> example fails to work as demonstrated:
>
> # dtrace -s pid.d -c test
> dtrace: script 'pid.d' matched 2 probes
> CPU     ID                    FUNCTION:NAME
>   1  59284                       main:entry
> dtrace: pid 25479 exited with status 1
> #
>
> Also, I get hangs when trying to do pretty much anything with the
> pid:::entry
>
> # dtrace -n 'pid$target::malloc:entry' -c 'echo x'
> dtrace: description 'pid$target::malloc:entry' matched 2 probes
> x
> CPU     ID                    FUNCTION:NAME
>   1  59311                     malloc:entry
> load: 0.43 cmd: dtrace 63737 [running] 8.93r 1.60u 4.19s 35% 25072k
> load: 0.88 cmd: echo 63738 [running] 45.10r 2.27u 18.75s 47% 1452k
> load: 1.19 cmd: dtrace 63737 [running] 70.32r 12.14u 33.27s 64% 25072k
>
> # procstat -k 63737 63738
>   PID    TID COMM             TDNAME           KSTACK
> 63737 101505 dtrace           -                mi_switch sleepq_catch_signals sleepq_timedwait_sig _sleep do_wait __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 63737 111024 dtrace           -                running
> 63738 101657 echo             -                mi_switch thread_suspend_switch ptracestop cursig ast doreti_ast
>
> I have previously tried using dtrace on 9.0R, but it was insta-panic.
>
> Is there anything I may have missed here?
>
> make.conf:
> STRIP=
> CFLAGS+=-fno-omit-frame-pointer
> WITH_CTF=1
>
> kernel config:
> include GENERIC
> ident DTRACE
> makeoptions DEBUG=-g
> makeoptions WITH_CTF=1
> options KDTRACE_FRAME
> options KDTRACE_HOOKS
> options DDB_CTF
> options DDB

Relevant to my interests, too. I've followed the instructions on the
wiki / in the handbook (on 9.0/9.1-PRE) and only receive error messages.
Is DTrace supposed to be working properly on 9.x, or is it still
experimental? It's nice to say that FreeBSD nominally supports DTrace,
but if it doesn't actually work then it needs to be labelled as such. I
am fine with it being experimental if that's the case, but saying so
would help manage expectations a lot better.

-- 
Thanks and best regards,
Chris Nehren
Re: Issue with igb and lagg (was Re: Problem with link aggregation + sshd)
On 09/12/2012 10:51 PM, Freddie Cash wrote:
> On Wed, Sep 12, 2012 at 1:48 PM, Jack Vogel jfvo...@gmail.com wrote:
>> On Wed, Sep 12, 2012 at 12:40 PM, Freddie Cash fjwc...@gmail.com wrote:
>>> Thanks for checking. I've used lagg(4) with igb, just not on 9.x.
>>> You're right, it seems to be pointing to the igb(4) driver in 9.x
>>> compared to 9.0.
>>
>> How do you determine that since it doesn't happen without lagg? I've
>> no reports of igb hanging otherwise and it's being used extensively.
>
> Well, I did say seems to. :)
>
> igb+lagg worked for us on 8.3. Haven't tried it since moving to 9.0
> and 9-STABLE on those three boxes.
>
> igb+lagg doesn't work for him on 9.0.
>
> Although, I don't recall if non-LACP options were tried earlier in
> this thread, or if it's just the LACP mode that's failing. If one mode
> works (say failover) and LACP mode doesn't, that seems to point to
> lagg.

Sorry, forgot to mention it. I tried both failover and lacp: neither
worked. The switch is a Dell PowerConnect 6248 with ports configured for
aggregation.

I first tried on a 9.1 prerelease, then on a 9.0 release to have
everything clean. In both, ssh, both as server and as client, becomes
unresponsive and unkillable.

The problem might also lie within ssh/sshd, but I somehow doubt it. I
haven't tried other network services.