Re: requesting vinum help
On Thu, 27 Nov 2003, Greg 'groggy' Lehey wrote:

> On Wednesday, 26 November 2003 at 12:04:52 -0600, Cosmin Stroe wrote:
> >
> > I am using vinum atm, and I am having serious problems with it. After
> > about 16 hrs of writing data to a vinum volume via NFS at a constant data
> > stream of 200k/sec and reading at 400k/sec at the same time, the whole
> > machine just freezes, hard. The only thing I can do is reboot. This
> > behavior appears in 4.8 and 5-CURRENT. I have no indication of what is
> > wrong, or how to go about finding it out. The problem is either with NFS
> > or Vinum, and I'm leaning towards Vinum (because of the failure in both
> > -STABLE and -CURRENT).
> >
> > I'm not the kind of person that relies on other people, and I like to fix
> > my own problems, but this is a problem which I cannot fix at this time.
> > So, I'm planning to look through the code of vinum and start messing with
> > it to figure out how it works and how to debug it.
>
> This is unlikely to get you very far. Some more details (offline if
> you prefer) would be handy, but as you say, you can't even be sure
> that it's Vinum. The best thing would be to get the system into the
> kernel debugger at the point of freeze, if that's possible, and try to
> work out what has happened.
>

Quick question: if this is a software problem with vinum, there should be
no way it can hard lock a machine. Is this assumption correct?

I should be able to invoke the kernel debugger by pressing the hotkey
(ctrl+alt+esc) while the machine is locked and get a backtrace (although
I'd be in an ISR servicing the hotkey, so I'm not sure it'd do much good).

Any special suggestions on debugging this kind of freezing problem? The
hardware has been tested and it's good (CPU, RAM, HDs). (Some kind of
watchdog in software??)

> > What would also be appreciated is an overall "map" of how vinum is
> > organized and how it works.
>
> You've read the documentation on http://www.vinumvm.org/, right? If
> you have any questions, I'm sure it can be improved on.
>

Yes :).

> Greg
> --
> See complete headers for address and phone numbers.
>

Cosmin Stroe.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: requesting vinum help
On Wed, 26 Nov 2003, Poul-Henning Kamp wrote:

> In message <[EMAIL PROTECTED]>, "Joel M. Baldwin" writes:
>
> >I was trying to use some restraint and not rant and rave in public like
> >I wanted to do. I'm rather miffed that nothing appeared in UPDATING.
> >Rather than an unproductive public RANT I thought I'd ask for private assistance.
> >I can post a summary afterwards if you like, or even better write a better
> >FAQ/tutorial on vinum.
>
> Joel,
>
> The problem is that vinum is a hot political potato in the project.
>
> In the eyes of a fair number of competent people, vinum has never
> quite "made it". I think most of them have given it a shot and
> lost data to it. Some of them, after looking in the code to "fix
> the problem", said "never again!" and now hate vinum of a good
> heart.
>
> Greg has disclaimed maintainership of vinum some time ago for reasons
> of politics, and he now is of the opinion that it is everybody's
> (else's) task to maintain vinum. Everybody else disagrees and believes
> that "vinum is very much Greg's own problem".
>
> With Greg being a core@ member, and well known for his ability to
> talk an Arcturan megadonkey into taking a stroll after first having
> talked its legs off about procedural issues, "Doing something about
> vinum" is permanently on the "we should really..." list and everybody
> hopes somebody else will "deal with it". Of course, in the end
> nobody does.
>
> As matters stand, we are doing our users a disservice by continuing
> to pretend everything is OK when in fact it is not at all.
>
> Personally, I think vinum(8) should not be in our 5-STABLE featureset
> if it is not brought up to current standards and actively maintained.
>
> But at the very least we should have the release notes reflect that
> vinum is unmaintained and believed to be unreliable and have vinum(8)
> issue a very stern warning to people along those lines.
>
> I'm sure that a major bikeshed will now ensue and people will argue
> that there is a lot more to this dispute than what I've said above.
>
> They're right of course, this is a very short summary :-)
>
> Poul-Henning
>

I am using vinum atm, and I am having serious problems with it. After
about 16 hrs of writing data to a vinum volume via NFS at a constant data
stream of 200k/sec and reading at 400k/sec at the same time, the whole
machine just freezes, hard. The only thing I can do is reboot. This
behavior appears in 4.8 and 5-CURRENT. I have no indication of what is
wrong, or how to go about finding it out. The problem is either with NFS
or Vinum, and I'm leaning towards Vinum (because of the failure in both
-STABLE and -CURRENT).

I'm not the kind of person that relies on other people, and I like to fix
my own problems, but this is a problem which I cannot fix at this time.
So, I'm planning to look through the code of vinum and start messing with
it to figure out how it works and how to debug it. This is how important
Vinum is to me at the moment.

I'm not a kernel coder, or an intense coder in general (but I'm proficient
in C/C++, and have used FreeBSD for quite some years now), so I'm reading
the Kernel Developer's Handbook as a starting point. If anyone has other
online documentation on FreeBSD kernel programming, it would be much
appreciated.

What would also be appreciated is an overall "map" of how vinum is
organized and how it works. Otherwise, I'll have to painstakingly go
through the code and figure everything out little by little (which I plan
to do, but if you know how Vinum works, everything is much easier, makes
sense right away, and takes less time).

Thank you in advance.

Cosmin Stroe.
LOR (swap_pager.c:1323, swap_pager.c:1838, uma_core.c:876) (current:Nov17)
Here is the stack backtrace:

lock order reversal
 1st 0xc1da318c vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
 2nd 0xc0724900 swap_pager swhash (swap_pager swhash) @ /usr/src/sys/vm/swap_pager.c:1838
 3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Stack backtrace:
backtrace(c0692be9,c0c358c4,c06a376c,c06a376c,c06a464d) at backtrace+0x17
witness_lock(c0c358c4,8,c06a464d,36c,1) at witness_lock+0x672
_mtx_lock_flags(c0c358c4,0,c06a464d,36c,1) at _mtx_lock_flags+0xba
obj_alloc(c0c22480,1000,c976f9db,101,c06f3f50) at obj_alloc+0x3f
slab_zalloc(c0c22480,1,c06a464d,68c,c0c22494) at slab_zalloc+0xb3
uma_zone_slab(c0c22480,1,c06a464d,68c,c0c22520) at uma_zone_slab+0xd6
uma_zalloc_internal(c0c22480,0,1,5c1,72e,c06f55a8) at uma_zalloc_internal+0x3e
uma_zalloc_arg(c0c22480,0,1,72e,2) at uma_zalloc_arg+0x3ab
swp_pager_meta_build(c1da318c,7,0,2,0) at swp_pager_meta_build+0x174
swap_pager_putpages(c1da318c,c976fbb8,8,0,c976fb20) at swap_pager_putpages+0x32d
default_pager_putpages(c1da318c,c976fbb8,8,0,c976fb20) at default_pager_putpages+0x2e
vm_pageout_flush(c976fbb8,8,0,0,c06f36a0) at vm_pageout_flush+0x17a
vm_pageout_clean(c0dae2d8,0,c06a4468,32a,0) at vm_pageout_clean+0x305
vm_pageout_scan(0,0,c06a4468,5a9,1f4) at vm_pageout_scan+0x65f
vm_pageout(0,c976fd48,c068d4ed,311,0) at vm_pageout+0x31b
fork_exit(c0625250,0,c976fd48) at fork_exit+0xb4
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xc976fd7c, ebp = 0 ---
Debugger("witness_lock")
Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0
db>

I'm running the sources from yesterday, Nov 17:

FreeBSD 5.1-CURRENT #0: Mon Nov 17 06:40:05 CST 2003
    root@:/usr/obj/usr/src/sys/GALAXY
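For anyone collecting several of these reports, the lock lines in the report above can be pulled apart mechanically. What follows is only a minimal sketch in Python, not anything shipped with FreeBSD: the function name `parse_lor` and the `REPORT` sample are made up for illustration, and the regular expression assumes the "Nth address name (type) @ file:line" layout seen in this one report.

```python
import re

# Hypothetical helper, not part of FreeBSD: extracts the lock lines from a
# WITNESS "lock order reversal" report. The pattern assumes the layout
# "<Nth> <address> <name> (<type>) @ <file>:<line>" seen in the report above.
LOCK_RE = re.compile(
    r"(?P<ordinal>\d+)(?:st|nd|rd|th)\s+"
    r"(?P<addr>0x[0-9a-f]+)\s+"
    r"(?P<name>.+?)\s+\((?P<type>.+?)\)\s+@\s+"
    r"(?P<file>\S+):(?P<line>\d+)"
)


def parse_lor(text):
    """Return one dict per lock line, in the order the report lists them."""
    locks = []
    for match in LOCK_RE.finditer(text):
        entry = match.groupdict()
        entry["ordinal"] = int(entry["ordinal"])
        entry["line"] = int(entry["line"])
        locks.append(entry)
    return locks


# Sample input copied from the report in this message.
REPORT = """lock order reversal
 1st 0xc1da318c vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
 2nd 0xc0724900 swap_pager swhash (swap_pager swhash) @ /usr/src/sys/vm/swap_pager.c:1838
 3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
"""

if __name__ == "__main__":
    for lock in parse_lor(REPORT):
        print(lock["ordinal"], lock["name"], lock["file"], lock["line"])
```

Keying reports on the (name, file, line) triples makes it easy to tell whether two reports describe the same reversal or different ones.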
Re: checking stopevent 2!
On Sat, Nov 15, 2003 at 09:38:37AM -0500, Robert Watson wrote:
>
> On Sat, 15 Nov 2003, Andy Farkas wrote:
>
> would probably be useful if you could drop to DDB and generate a trace for
> the event.
>

I've done that, in this email message:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=2157067+0+current/freebsd-current

> > ...
> > Nov 15 16:05:44 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:44 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/kern_condvar.c:289
> > Nov 15 16:05:44 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:44 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:44 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:45 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:45 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:45 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4663aa8) locked @ /hummer/src-current/src/sys/kern/kern_synch.c:293
> > Nov 15 16:05:45 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:45 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4663aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:45 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:45 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4663aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:45 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:45 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/kern_condvar.c:289
> > Nov 15 16:05:45 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:46 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:46 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:46 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/kern_condvar.c:289
> > Nov 15 16:05:46 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:46 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:46 hummer kernel: checking stopevent 2 with the following
> > non-sleepable locks held:
> > Nov 15 16:05:46 hummer kernel: exclusive sleep mutex sigacts r = 0
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > ...
> >
> > This is latest -current (cvsup'd a few hours ago)
> >
> > --
> >
> > :{ [EMAIL PROTECTED]
> >
> > Andy Farkas
> > System Administrator
> > Speednet Communications
> > http://www.speednet.com.au/
> >

Cosmin Stroe
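With this many near-identical warnings, it can help to tally them by the source location where the lock was taken. The following is a rough Python sketch under stated assumptions: `count_lock_sites` and `SAMPLE` are names invented here, and the pattern only relies on the "locked @ file:line" fragment visible in the quoted log, so it works whether or not the quoting prefixes are still present.

```python
import re
from collections import Counter

# Hypothetical helper for illustration only. It keys on the
# "locked @ <file>:<line>" fragment that appears in each warning above.
SITE_RE = re.compile(r"locked @ (\S+):(\d+)")


def count_lock_sites(log_text):
    """Count warnings per (source file basename, line number)."""
    counts = Counter()
    for path, line in SITE_RE.findall(log_text):
        basename = path.rsplit("/", 1)[-1]
        counts[(basename, int(line))] += 1
    return counts


# Three entries copied from the quoted log, quoting prefixes included.
SAMPLE = """
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/kern_condvar.c:289
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > (0xc4663aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
"""

if __name__ == "__main__":
    for (name, line), n in count_lock_sites(SAMPLE).most_common():
        print(f"{name}:{line} x{n}")
```

The tally says nothing about why the mutex is held at those points; it only shows which call sites dominate the log.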
exclusive sleep mutex ... /usr/src/sys/kern/kern_synch.c:293
: port 0xe800-0xe80f,0xe400-0xe403,0xe000-0xe007,0xdc00-0xdc03,0xd800-0xd807 mem 0xdc00-0xdc003fff irq 11 at device 10.0 on pci0
atapci1: [MPSAFE]
ata2: at 0xd800 on atapci1
ata2: [MPSAFE]
ata3: at 0xe000 on atapci1
ata3: [MPSAFE]
orm0: at iomem 0xc8000-0xca7ff,0xc-0xc7fff on isa0
atkbdc0: at port 0x64,0x60 on isa0
atkbd0: flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: failed to get data.
psm0: irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
fdc0: at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: parallel port not found.
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown: can't assign resources (port)
unknown: can't assign resources (irq)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
Timecounter "TSC" frequency 1100046119 Hz quality 800
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to accept, logging disabled
GEOM: create disk ad0 dp=0xc1c53c60
ad0: 19092MB [38792/16/63] at ata0-master UDMA66
acd0: DVDROM at ata1-slave PIO4
GEOM: create disk ad4 dp=0xc1c53a60
ad4: 29196MB [59320/16/63] at ata2-master UDMA66
Mounting root from ufs:/dev/ad0s1a
Loading configuration files.
00400 reject tcp from any to any dst-port 161 via sis0
Entropy harvesting: interrupts ethernet point_to_point.
kernel dumps on /dev/ad0s1b
swapon: adding /dev/ad0s1b as swap device
Starting file system checks:
/dev/ad0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1a: clean, 9028 free (372 frags, 1082 blocks, 0.6% fragmentation)
/dev/ad0s1e: DEFER FOR BACKGROUND CHECKING
/dev/ad0s1d: DEFER FOR BACKGROUND CHECKING
/dev/ad4s1: DEFER FOR BACKGROUND CHECKING
/dev/ad0s1f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1f: clean, 1965768 free (79008 frags, 235845 blocks, 0.9% fragmentation)
WARNING: /tmp was not properly dismounted
WARNING: /var was not properly dismounted
/var: superblock summary recomputed
WARNING: /mnt/ftp was not properly dismounted
debug.witness_ddb: 0 -> 1
Setting hostname: cosmin.phy.uic.edu.
nge0: gigabit link up
lo0: flags=8049 mtu 16384
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
nge0: flags=8843 mtu 1500
        options=13
        inet 131.193.192.26 netmask 0xff00 broadcast 131.193.192.255
        inet6 fe80::250:baff:fe39:6d6%nge0 prefixlen 64 tentative scopeid 0x1
        ether 00:50:ba:39:06:d6
        media: Ethernet autoselect (none)
        status: no carrier
00400 reject tcp from any to any dst-port 161 via sis0
db> panic
panic: from debugger
syncing disks, buffers remaining... 453 453 453 453 453 453 453 453 453 453 453 453 453 453 453 453 453 453 453 453
giving up on 409 buffers
Uptime: 7s
Dumping 128 MB
 16 32 48 64 80 96 112
Dump complete
Flushed all rules.
00100 allow ip from any to any via lo0
00200 deny ip from any to 127.0.0.0/8
00300 deny ip from 127.0.0.0/8 to any
65000 allow ip from any to any
Firewall rules loaded, starting divert daemons:.
net.inet.ip.fw.enable: 1 -> 1
add net default: gateway 131.193.192.1
Additional routing options:.
hw.bus.devctl_disable: 0 -> 1
Mounting NFS file systems:.
Starting syslogd.
Nov 14 19:38:26 syslogd: /var/log/debug.log: No such file or directory
Nov 14 19:38:26 cosmin syslogd: kernel boot file is /boot/kernel/kernel
checking stopevent 2 with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc1cb8aa8) locked @ /usr/src/sys/kern/kern_synch.c:293
Debugger("witness_warn")
Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0
db> trace
Debugger(c0675228,c93f4b88,1,c93f4b84,0) at Debugger+0x54
witness_warn(5,c1c0fcc8,c068d494,2,c06f37a0) at witness_warn+0x19f
issignal(c1bb8dc0,2,c068fc5b,bd,c1c0fcc8) at issignal+0x16b
cursig(c1bb8dc0,0,c0690152,125,1) at cursig+0xe8
msleep(c1c0fc5c,c1c0fcc8,15c,c068fb80,0) at msleep+0x631
wait1(c1bb8dc0,c93f4d10,0,c93f4d40,c065bca0) at wait1+0x990
wait4(c1bb8dc0,c93f4d10,c06a868e,3ee,4) at wait4+0x20
syscall(2f,2f,2f,bfbfeec0,bfbfeec0) at syscall+0x2e0
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (7, FreeBSD ELF32, wait4), eip = 0x280d0b1f, esp = 0xbfbfe84c, ebp = 0xbfbfe868 ---
db>

Cosmin Stroe
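To compare a trace like the one above against another report, the frames can be reduced to a plain call chain. This is only a sketch in Python, assuming the "func(args) at func+0xoffset" frame layout shown in the DDB output in this message; `parse_frames` and `TRACE` are hypothetical names, not an existing tool.

```python
import re

# Hypothetical helper for illustration only. It assumes each frame looks like
# "<function>(<comma-separated args>) at <function>+0x<hex offset>", as in the
# DDB trace above. Lines such as 'Debugger("witness_warn")' have no " at "
# part and are skipped.
FRAME_RE = re.compile(r"(\w+)\(([^)]*)\) at \w+\+0x([0-9a-f]+)")


def parse_frames(text):
    """Return the frames in printed order (innermost frame first)."""
    frames = []
    for func, args, offset in FRAME_RE.findall(text):
        frames.append({
            "func": func,
            "args": args.split(",") if args else [],
            "offset": int(offset, 16),
        })
    return frames


# Three frames copied from the trace in this message.
TRACE = """Debugger(c0675228,c93f4b88,1,c93f4b84,0) at Debugger+0x54
witness_warn(5,c1c0fcc8,c068d494,2,c06f37a0) at witness_warn+0x19f
Xint0x80_syscall() at Xint0x80_syscall+0x1d
"""

if __name__ == "__main__":
    print(" <- ".join(frame["func"] for frame in parse_frames(TRACE)))
```

Two traces that yield the same function list most likely reached the warning by the same path, even when the argument values differ.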