Re: FreeBSD based bandwidth manager, traffic shaper
> I am looking for a high-performance bandwidth manager / traffic shaper for an IP core network, to configure leased-line, xDSL, Ethernet, GPON/EPON, and wireless subscribers. Is there any FreeBSD-based solution?

Juniper :-)

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880 Fax: +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> Anyway, I have already looked at the ddb output and said, with very high confidence, that it looks like either a driver or a hardware problem. I think the project's time could be spent more productively elsewhere while the submitter checks his hardware, for instance by changing the controller, the disks, or the controller type.

I have already said that:

1. This controller and these disks previously worked with FreeBSD 4.6.2 without any problems;
2. I tried replacing a disk with another one (the same model), but it did not help. Unfortunately, I have no other free SCSI controller (but see #1);
3. I have another AMD64 machine with different hardware (including disks and SCSI controller) that periodically suffers from the same problem. Unfortunately, that machine is in production and heavily loaded, so I can't load it even more with INVARIANTS, WITNESS, and DIAGNOSTIC; my clients would not forgive me for that.

--
Oleg Derevenetz
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> Dumpdev is a swap partition on da0 (a single physical disk) that is connected to a Mylex AcceleRAID 170 RAID controller. The problem occurs when I copy a large number of files from FTP to another disk (da1) that is connected to the same RAID controller.

> If the driver or controller is misbehaving it could explain both problems. Any chance you can get another disk in there on a different controller to dump onto?

> Yes, I got an IDE disk and saved a kernel dump of another static hang state on it. Here is the dump:
>
> ftp://oleg.vsi.ru/private/vmcore.0.zip

> Is this just the vmcore, or the debugging kernel also? Both are needed to make sense of the dump.

The kernel binary with the kernel config is here:

ftp://oleg.vsi.ru/private/kernel.zip

This kernel was built statically, and no modules are loaded at boot at all.

--
Oleg Derevenetz
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
) thread 26
[Switching to thread 26 (Thread 100019)]#0 0xc05663ab in sched_switch ()
(kgdb) bt
#0 0xc05663ab in sched_switch ()
#1 0xc055b868 in mi_switch ()
#2 0xc0573bd9 in sleepq_switch ()
#3 0xc0573de2 in sleepq_timedwait ()
#4 0xc055b269 in msleep ()
#5 0xc050cfa8 in usb_event_thread ()
#6 0xc05400cc in fork_exit ()
#7 0xc06b17cc in fork_trampoline ()

Anyway, the kernel in kernel.zip is exactly the kernel that generated this vmcore. I have no other kernel on this machine :-)

--
Oleg Derevenetz
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> Dumpdev is a swap partition on da0 (a single physical disk) that is connected to a Mylex AcceleRAID 170 RAID controller. The problem occurs when I copy a large number of files from FTP to another disk (da1) that is connected to the same RAID controller.

> If the driver or controller is misbehaving it could explain both problems. Any chance you can get another disk in there on a different controller to dump onto?

Yes, I got an IDE disk and saved a kernel dump of another static hang state on it. Here is the dump:

ftp://oleg.vsi.ru/private/vmcore.0.zip

--
Oleg Derevenetz
Re: rrdtool performance tuning (fwd)
> [hmm, after thinking a bit I decided it would be more appropriate here, in [EMAIL PROTECTED]]
>
> Dear colleagues, any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)? The machine is mostly I/O-bound, showing 100% disk load with 8 or sometimes even 3 MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror).

For example, the update algorithm can be changed. Try not to update the RRD files simultaneously; instead, queue the update data (with timestamps), for example in memory, and periodically do a bulk update using a single rrdupdate call for all queued items that belong to a single RRD file. This saves a lot of I/O. For example, my NMS (TclMon) uses a similar scheme, and it now updates 50K RRD files with 5-10 variables each. The I/O load is 100-150 tps with 1-1.5 MB/s throughput.

--
Oleg Derevenetz
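The queueing scheme described in this reply could be sketched roughly as follows. This is only an illustration, not TclMon's actual code: the class name `RrdUpdateQueue` and its methods are hypothetical, and the sketch assumes the `rrdtool` binary is on the PATH and that each RRD file receives at most one sample per timestamp (rrdtool rejects non-increasing timestamps). It relies only on the fact that `rrdtool update` accepts several `timestamp:value[:value...]` arguments in one call.

```python
import subprocess
from collections import defaultdict


class RrdUpdateQueue:
    """Buffer samples in memory and write each RRD file once per flush.

    Hypothetical helper for illustration; not part of rrdtool or TclMon.
    """

    def __init__(self):
        # Maps an RRD file path to its pending "timestamp:v1:v2..." strings.
        self._pending = defaultdict(list)

    def add(self, rrd_path, timestamp, values):
        """Queue one sample; nothing touches the disk yet."""
        fields = [str(int(timestamp))] + [str(v) for v in values]
        self._pending[rrd_path].append(":".join(fields))

    def build_commands(self):
        """Return one `rrdtool update` argv per file, samples in time order."""
        commands = []
        for rrd_path, samples in self._pending.items():
            ordered = sorted(samples, key=lambda s: int(s.split(":", 1)[0]))
            commands.append(["rrdtool", "update", rrd_path] + ordered)
        return commands

    def flush(self, run=subprocess.check_call):
        """Run the batched updates, clear the queue, return the call count."""
        commands = self.build_commands()
        self._pending.clear()
        for argv in commands:
            run(argv)
        return len(commands)
```

A collector would call `add()` for every polled sample and `flush()` from a timer, for example every few minutes, so that each RRD file is opened and written once per interval instead of once per sample.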
Fw: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> Oleg, one thing you can do to make this less painful is to run your machine's console over a serial port. First get a crossover serial cable and make sure it works from one box to another; it should be easy to run tip com1 on both boxes to ensure that it works. Then you just need to add console=comconsole to /boot/loader.conf and your box's console should come over serial. Then, on the machine watching the console, you can just do this:
>
> % script
> Script started, output file is typescript
> % tip com1
> ...do ddb stuff now...
> ...stop tip
> % exit
>
> Now you should have everything logged into a file called typescript, which should save you a big headache.

> Thanks, I'll try it on Monday morning. I posted a followup to kern/104406 that includes all the information listed in the Debugging Deadlocks chapter of the FreeBSD Developers' Handbook. Can anyone take a look at it and say whether this is certainly a hardware problem or some sort of software problem?

Anyone ? :-)

--
Oleg Derevenetz
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands listed here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>
> After I break into the debugger with the Ctrl+Alt+Esc sequence and enter the panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show command output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making a mistake when copying it from the screen to a sheet of paper and back :-)

> Oleg, one thing you can do to make this less painful is to run your machine's console over a serial port. First get a crossover serial cable and make sure it works from one box to another; it should be easy to run tip com1 on both boxes to ensure that it works. Then you just need to add console=comconsole to /boot/loader.conf and your box's console should come over serial. Then, on the machine watching the console, you can just do this:
>
> % script
> Script started, output file is typescript
> % tip com1
> ...do ddb stuff now...
> ...stop tip
> % exit
>
> Now you should have everything logged into a file called typescript, which should save you a big headache.

Thanks, I'll try it on Monday morning. I posted a followup to kern/104406 that includes all the information listed in the Debugging Deadlocks chapter of the FreeBSD Developers' Handbook. Can anyone take a look at it and say whether this is certainly a hardware problem or some sort of software problem?

--
Oleg Derevenetz
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> After I break into the debugger with the Ctrl+Alt+Esc sequence and enter the panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show command output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making a mistake when copying it from the screen to a sheet of paper and back :-)

> Oleg, one thing you can do to make this less painful is to run your machine's console over a serial port. First get a crossover serial cable and make sure it works from one box to another; it should be easy to run tip com1 on both boxes to ensure that it works. Then you just need to add console=comconsole to /boot/loader.conf and your box's console should come over serial. Then, on the machine watching the console, you can just do this:
>
> % script
> Script started, output file is typescript
> % tip com1
> ...do ddb stuff now...
> ...stop tip
> % exit
>
> Now you should have everything logged into a file called typescript, which should save you a big headache.

> Thanks, I'll try it on Monday morning.

> As far as getting a dump from ddb, try this:
>
> ddb call doadump
>
> I'm completely at a loss why this isn't a base ddb command "dump", but whatever... :)

> Unfortunately, this doesn't work either. I called the duty personnel in the datacenter and asked them to do this, and the person on duty tells me that after he enters this command, something like this appears on the monitor:
>
> db call doadump
> Dumping 3072 MB
> Dump aborted error I/O
> Dump failed. (Error 5)

> Hmnmm, that seems like you might be having a hardware problem,

It is possible, but unlikely:

1. I have similar symptoms on another AMD64 machine with 6.2 (uname -a from this machine is listed in PR kern/104406, in my followup of Wed, 7 Mar 2007 05:10:59 +0300), but they are rare and that machine is in production, so I can't experiment with it;
2. All of this hardware previously worked with FreeBSD 4.6.

> what disk device do you have?

Dumpdev is a swap partition on da0 (a single physical disk) that is connected to a Mylex AcceleRAID 170 RAID controller. The problem occurs when I copy a large number of files from FTP to another disk (da1) that is connected to the same RAID controller.

> Have you also enabled kernel dumps via /etc/rc.conf:dumpdev= ?

Yes, I have dumpdev=AUTO in rc.conf and the swap device (4G) listed in /etc/fstab.

--
Oleg Derevenetz
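As a small aid for reading the failed-dump output quoted in this message, the sketch below decodes the reported error number and checks the dump device size. It assumes that the number printed after "Dump failed" is an ordinary errno value, and that a full (non-mini) dump needs a dump device at least as large as physical memory; both are assumptions about this kernel, not something stated in the thread. The function names are hypothetical.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


def dump_fits(ram_mb, dumpdev_mb):
    """Assumed rule: a full dump needs a device at least as large as RAM."""
    return dumpdev_mb >= ram_mb


if __name__ == "__main__":
    # Values taken from the thread: "Error 5", "Dumping 3072 MB", 4G swap.
    print(describe_errno(5))
    print(dump_fits(ram_mb=3072, dumpdev_mb=4096))
```

Under these assumptions the 4G swap partition is large enough for a 3072 MB dump, and error 5 maps to EIO, which would be consistent with the write to the dump device itself failing rather than with a sizing problem.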
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands listed here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>
> After I break into the debugger with the Ctrl+Alt+Esc sequence and enter the panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show command output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making a mistake when copying it from the screen to a sheet of paper and back :-)

> This is a very easy to reproduce [ufs] uninterruptible deadlock on both RELENG_6 and RELENG_7. Look at this PR:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439
>
> The PR is closed but the problem is still here with 7.0-PRERELEASE and, perhaps, CURRENT.

This is probably a different bug, because:

1. I built the kernel with INVARIANTS as described on the Debugging Deadlocks page of the FreeBSD Developers' Handbook and got no panic, only a deadlock;
2. I have no NTFS filesystem at all and just copy file(s) from FTP to a local UFS using mc. In that PR the panic occurred when NTFS was mounted r/w (and did NOT occur when the same NTFS was mounted r/o).

--
Oleg Derevenetz
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands listed here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>
> After I break into the debugger with the Ctrl+Alt+Esc sequence and enter the panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show command output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making a mistake when copying it from the screen to a sheet of paper and back :-)

> Oleg, one thing you can do to make this less painful is to run your machine's console over a serial port. First get a crossover serial cable and make sure it works from one box to another; it should be easy to run tip com1 on both boxes to ensure that it works. Then you just need to add console=comconsole to /boot/loader.conf and your box's console should come over serial. Then, on the machine watching the console, you can just do this:
>
> % script
> Script started, output file is typescript
> % tip com1
> ...do ddb stuff now...
> ...stop tip
> % exit
>
> Now you should have everything logged into a file called typescript, which should save you a big headache.

> Thanks, I'll try it on Monday morning.

> As far as getting a dump from ddb, try this:
>
> ddb call doadump
>
> I'm completely at a loss why this isn't a base ddb command "dump", but whatever... :)

Unfortunately, this doesn't work either. I called the duty personnel in the datacenter and asked them to do this, and the person on duty tells me that after he enters this command, something like this appears on the monitor:

db call doadump
Dumping 3072 MB
Dump aborted error I/O
Dump failed. (Error 5)

--
Oleg Derevenetz
kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Hi all,

Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands listed here:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

After I break into the debugger with the Ctrl+Alt+Esc sequence and enter the panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show command output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making a mistake when copying it from the screen to a sheet of paper and back :-)

--
Oleg Derevenetz
Re: How to report bugs (Re: 6.2-STABLE deadlock?)
Quoting Kris Kennaway [EMAIL PROTECTED]:
> Oleg Derevenetz wrote:
> Quoting LI Xin [EMAIL PROTECTED]:
> [...]
> I'm not very sure if this is specific to one disk controller. Actually I got some occasional reports about similar hangs on amd64 6.2-RELEASE (slightly patched version) where most of the processes were stuck in the 'ufs' state under very light load; the box was equipped with amr(4) RAID. I was not able to reproduce the problem in my lab, though, and it's still unknown how to trigger the livelock :-( I still need to investigate on their production system.

> I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=
>
> and there should be a thread related to this. Briefly, I suspect that this is related to the nullfs filesystems on my server: when I cvsupped to FreeBSD 6.2-STABLE with Daichi's unionfs-related patches and replaced the nullfs-mounted filesystems with unionfs-mounted ones (this was done on 10.03.07), the problem went away (or seems to have, at least).

> Hmm... Seems to be different issues. The problem I received was on a pgsql server (no nullfs/unionfs involved), and the hang always happens when it is not heavily loaded (usually in the morning, for instance, and there is no special configuration, like scheduled tasks which can generate disk load, etc., only the entropy harvesting), so this is quite confusing.

> Yes, a large part of the confusion is the unfortunate tendency of people to do the following:
>
> user1: my system hangs/panics/etc
> user2: my system hangs/panics/etc too; it must be the same problem!
>
> What we really need is for every FreeBSD user who encounters a hang/panic/etc to avoid jumping to conclusions -- no matter how many superficial similarities there may seem to you -- and instead go through the relevant steps described here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
>
> Until you (or a developer) have analyzed the resulting information, you cannot definitively determine whether or not your problem is the same as a given random other problem, and you may just confuse the issue by making claims of similarity when you are really reporting a completely separate problem.

Not everyone can do deadlock debugging, though. In my case, turning on INVARIANTS and WITNESS causes an unacceptable performance penalty because the server is heavily loaded. So I can only describe my case, my actions, and the result, without providing any debug information.
Re: How to report bugs (Re: 6.2-STABLE deadlock?)
Quoting Kris Kennaway [EMAIL PROTECTED]:
> On Wed, Apr 25, 2007 at 12:14:20PM +0400, Oleg Derevenetz wrote:
> Until you (or a developer) have analyzed the resulting information, you cannot definitively determine whether or not your problem is the same as a given random other problem, and you may just confuse the issue by making claims of similarity when you are really reporting a completely separate problem.

> Not everyone can do deadlock debugging, though. In my case, turning on INVARIANTS and WITNESS causes an unacceptable performance penalty because the server is heavily loaded. So I can only describe my case, my actions, and the result, without providing any debug information.

> But you can still do *some* things, e.g. backtraces and/or a coredump: every little bit helps. Ultimately, though, you have to understand and accept that the less information you provide, the less chance there is that a developer will be able to track down your problem. In fact a developer may have to effectively ignore your problem report altogether, because of what I explained about symptoms usually not being enough to tell one bug from another.
>
> In general, when you encounter a bug in FreeBSD, you have a little bit of work to do on your side before we can start doing the rest. I understand that you may not be in a position to do that work, but that means you also need to understand that we can't do it either.

In fact, I have solved (or worked around) this problem for myself, so in this thread I am offering my workaround as a possible workaround for users who experience the same problem. This is only a hint for them, not a bug report for you. I cannot provide full (or even partial) debug information because I will not back out the cvsupped sources, will not replace unionfs with nullfs again, and will not wait a week or more for another hang.
Re: 6.2-STABLE deadlock?
Quoting LI Xin [EMAIL PROTECTED]:
> Kostik Belousov wrote:
> On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> On Tue, Mar 13, 2007 at 02:08:48PM +, Adrian Wontroba wrote:
> At work, amongst my stable of old computers running FreeBSD, I have a Fujitsu M800 - a 4 Xeon SMP processor with 4 GB of memory. This primarily runs Nagios and a small and lightly used MySQL database, along with a few inbound FTP transfers per minute. It has a Mylex card based disc subsystem, ruling out crash dumps. At some point during 5.5-STABLE this machine started to occasionally hang ...

> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken rather sooner after the hang. Processes with wmesg=ufs feature often in the ps output.
>
> http://www.stade.co.uk/crash1/

> I would suspect the mlx controller. There are several processes (for instance, 988, 50918) waiting for completion of a block read, and the processes in the ufs state are the result of the lock cascade, IMHO.

> I'm not very sure if this is specific to one disk controller. Actually I got some occasional reports about similar hangs on amd64 6.2-RELEASE (slightly patched version) where most of the processes were stuck in the 'ufs' state under very light load; the box was equipped with amr(4) RAID. I was not able to reproduce the problem in my lab, though, and it's still unknown how to trigger the livelock :-( I still need to investigate on their production system.

I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:

http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=

and there should be a thread related to this. Briefly, I suspect that this is related to the nullfs filesystems on my server: when I cvsupped to FreeBSD 6.2-STABLE with Daichi's unionfs-related patches and replaced the nullfs-mounted filesystems with unionfs-mounted ones (this was done on 10.03.07), the problem went away (or seems to have, at least).
Re: Processes get stuck in ufs state
Quoting Oleg Derevenetz [EMAIL PROTECTED]:
> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
> Sometimes (approximately once a week) I have a problem with the same symptoms as described here, on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm) Processor 850:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=
>
> Sometimes (apparently when the CPU load suddenly goes up) all processes that interact with the disk get stuck in the ufs state, but in my case SIGSTOP/SIGCONT does not seem to help.

> See the developer handbook, Deadlock Debugging chapter, for instructions on what information should be gathered to debug the problem.

> OK, I built a kernel with the debug options and will wait for a hang. By the way, with the debug options turned on, I see this message on every boot while the nullfs mounts are in progress:
>
> acquiring duplicate lock of same type: vnode interlock
> 1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
> 2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
> KDB: stack backtrace:
> kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
> witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
> _mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
> vrefcnt(cfd5c414) at vrefcnt+0x20
> null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
> null_lock(f02f1a68) at null_lock+0x66
> VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
> vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
> nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
> vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
> vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
> nmount(cfc60300,f02f1d04) at nmount+0x8b
> syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---
>
> This host has nullfs filesystems. Can this be related to the deadlock?

FYI: after replacing the nullfs filesystems with unionfs (using the new unionfs implementation):

http://people.freebsd.org/~daichi/unionfs/

all the deadlocks are gone. It seems to be a problem in the current nullfs implementation, but I can't debug it properly because the deadlocks are relatively rare and the machine that uses nullfs is heavily loaded, so the WITNESS and DEBUG options cause an unacceptable performance penalty.
Re: Processes get stuck in ufs state
> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
> Sometimes (approximately once a week) I have a problem with the same symptoms as described here, on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm) Processor 850:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=
>
> Sometimes (apparently when the CPU load suddenly goes up) all processes that interact with the disk get stuck in the ufs state, but in my case SIGSTOP/SIGCONT does not seem to help.

> See the developer handbook, Deadlock Debugging chapter, for instructions on what information should be gathered to debug the problem.

OK, I built a kernel with the debug options and will wait for a hang. By the way, with the debug options turned on, I see this message on every boot while the nullfs mounts are in progress:

acquiring duplicate lock of same type: vnode interlock
1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
KDB: stack backtrace:
kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
_mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
vrefcnt(cfd5c414) at vrefcnt+0x20
null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
null_lock(f02f1a68) at null_lock+0x66
VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
nmount(cfc60300,f02f1d04) at nmount+0x8b
syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---

This host has nullfs filesystems. Can this be related to the deadlock?

--
Oleg Derevenetz
Processes get stuck in ufs state
0xfe30-0xfe300fff irq 32 at device 1.0 on pci32
pci33: ACPI PCI bus on pcib5
pci32: base peripheral, interrupt controller at device 1.1 (no driver attached)
pcib6: ACPI PCI-PCI bridge mem 0xfe302000-0xfe302fff irq 36 at device 2.0 on pci32
pci34: ACPI PCI bus on pcib6
pci32: base peripheral, interrupt controller at device 2.1 (no driver attached)
pcib7: ACPI PCI-PCI bridge mem 0xfe304000-0xfe304fff irq 40 at device 3.0 on pci32
pci35: ACPI PCI bus on pcib7
pci32: base peripheral, interrupt controller at device 3.1 (no driver attached)
pcib8: ACPI PCI-PCI bridge mem 0xfe306000-0xfe306fff irq 44 at device 4.0 on pci32
pci36: ACPI PCI bus on pcib8
pci32: base peripheral, interrupt controller at device 4.1 (no driver attached)
atkbdc0: Keyboard controller (i8042) port 0x60,0x64 irq 1 on acpi0
atkbd0: AT Keyboard irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sio0: 16550A-compatible COM port port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
fdc0: floppy drive controller port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: 1440-KB 3.5 drive on fdc0 drive 0
pmtimer0 on isa0
orm0: ISA Option ROMs at iomem 0xc-0xc7fff,0xc8000-0xc97ff,0xc9800-0xcafff,0xcb000-0xcefff on isa0
ppc0: parallel port not found.
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
Timecounters tick every 1.000 msec
IP Filter: v4.1.13 initialized. Default = block all, Logging = enabled
Waiting 5 seconds for SCSI devices to settle
acd0: DMA limited to UDMA33, controller found non-ATA66 cable
acd0: DVDROM MATSHITADVD-ROM SR-8178/PZ21 at ata1-master UDMA33
ses0 at mpt0 bus 0 target 6 lun 0
ses0: SDR GEM318P 1 Fixed Processor SCSI-2 device
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
SMP: AP CPU #1 Launched!
da0 at mpt0 bus 0 target 0 lun 0
da0: SEAGATE ST373207LC 0003 Fixed Direct Access SCSI-3 device
da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 70007MB (143374744 512 byte sectors: 255H 63S/T 8924C)
da1 at mpt0 bus 0 target 2 lun 0
da1: SEAGATE ST336807LC 0C01 Fixed Direct Access SCSI-3 device
da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
Trying to mount root from ufs:/dev/da0s1a
Accounting enabled

I recently posted a followup to this PR with a description of the problem. Any ideas on how to debug this?

--
Oleg Derevenetz