Re: indefinite wait buffer: Does this indicate hardware issue?
Hi,

On 12/27/05, Nikolay Pavlov <[EMAIL PROTECTED]> wrote:
[snip]
> Seems i have the same problem.
>
> ad1: req=0xc2998000 SETFEATURES SET TRANSFER MODE semaphore timeout !!
> DANGER Will Robinson !!
> swap_pager: indefinite bufobj: 0 blkno: 3 size: 4096

This looks like a driver problem, or hardware issue...

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: indefinite wait buffer: Does this indicate hardware issue?
On Saturday, 17 December 2005 at 0:49:48 +0800, Xin LI wrote:
> Dear folks,
>
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
>
> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.
>
> Does this indicate a hardware issue, or some bugs elsewhere?
>
> Cheers,
> --
> Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

It seems I have the same problem.

ad1: req=0xc2998000 SETFEATURES SET TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
swap_pager: indefinite bufobj: 0 blkno: 3 size: 4096

Then the machine just halts. But not every time: sometimes it boots without any problems, and I don't know why.

Here is additional information:

# uname -a
FreeBSD spirit 6.0-RELEASE FreeBSD 6.0-RELEASE #0: Fri Nov 11 17:00:05 UTC 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SPIRIT i386

ATA channel 0:
    Master: no device present
    Slave: ad1

--
Best regards, Nikolay Pavlov.
Re: indefinite wait buffer: Does this indicate hardware issue?
Hi,

On 12/23/05, Douglas K. Rand <[EMAIL PROTECTED]> wrote:
> Xin> Which scheduler are you using?
>
> SCHED_4BSD.

Thanks. I have done some experiments (with the ULE scheduler) to try to trigger the issue, with no luck either :-(

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
Re: indefinite wait buffer: Does this indicate hardware issue?
Xin> Hi

Hi.

On 19 Dec 2005 14:32:31 -0600, Douglas K. Rand <[EMAIL PROTECTED]> wrote:
Doug> Tracing command swapper pid 0 tid 0 td 0xc0698e20
Doug> sched_switch(c0698e20,0,1) at sched_switch+0x14b
Doug> mi_switch(1,0) at mi_switch+0x1ba
Doug> scheduler(0,81ec00,81e000,0,c042f5d5) at scheduler+0x262
Doug> mi_startup() at mi_startup+0x96
Doug> begin() at begin+0x2c

Xin> Which scheduler are you using?

SCHED_4BSD.
Re: indefinite wait buffer: Does this indicate hardware issue?
Hi,

On 19 Dec 2005 14:32:31 -0600, Douglas K. Rand <[EMAIL PROTECTED]> wrote:
[...]
> Tracing command swapper pid 0 tid 0 td 0xc0698e20
> sched_switch(c0698e20,0,1) at sched_switch+0x14b
> mi_switch(1,0) at mi_switch+0x1ba
> scheduler(0,81ec00,81e000,0,c042f5d5) at scheduler+0x262
> mi_startup() at mi_startup+0x96
> begin() at begin+0x2c

Which scheduler are you using?

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
Re: indefinite wait buffer: Does this indicate hardware issue?
** On Sat, 17 Dec 2005 00:49:48 +0800, Xin LI <[EMAIL PROTECTED]> said:

Xin> I have a box indicating the following sometimes:
Xin> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"

We are having a very similar problem that we've been trying to diagnose off and on for a while. About 20% of the time the system will emit very similar messages. In each case processes trying to write to disks get stuck in the "ufs" state.

This is a dual CPU AMD Athlon(tm) MP 1600+ system on a Tyan 2460 mobo with 512 MB of RAM and an ICP Vortex GDT8546RZ SATA RAID controller.

Here is the result of a C-t on an interactive shell via a serial console:

load: 0.00 cmd: csh 547 [ufs] 0.02u 0.01s 0% 2516k

With the kernel debugger a few processes will be stuck in the same state:

db> ps
pid proc uid ppid pgrp flag stat wmesg wchan cmd
647 c1a080000 645 647 110 [SLPQ ufs 0xc1856c08][SLP] cron
646 c18834182 644 646 110 [SLPQ biord 0xcbcff650][SLP] cron
645 c1a0a8300 493 493 000 [SLPQ ppwait 0xc1a0a830][SLP] cron
644 c1887c480 493 493 000 [SLPQ ppwait 0xc1887c48][SLP] cron
643 c1a0a4180 632 642 0004002 [SLPQ wdrain 0xc06a26c4][SLP] bsdtar
632 c1a08a3c0 631 632 0004002 [SLPQ pause 0xc1a08a70][SLP] tcsh
631 c1834624 1001 616 631 0004102 [SLPQ wait 0xc1834624][SLP] su
616 c188720c 1001 615 616 0004002 [SLPQ pause 0xc1887240][SLP] tcsh
615 c1a0a624 1001 613 613 100 [SLPQ select 0xc06a2144][SLP] sshd
613 c18836240 471 613 0004100 [SLPQ sbwait 0xc1b3dbc8][SLP] sshd
547 c1671a3c0 537 547 0004002 [SLPQ ufs 0xc1856c08][SLP] csh
537 c1883a3c0 1 537 0004102 [SLPQ wait 0xc1883a3c][SLP] login
536 c18876240 1 536 0004002 [SLPQ ttyin 0xc16e8810][SLP] getty
535 c16716240 1 535 0004002 [SLPQ ttyin 0xc16e8410][SLP] getty
534 c1a0aa3c0 1 534 0004002 [SLPQ ttyin 0xc16dac10][SLP] getty
533 c1671c480 1 533 0004002 [SLPQ ttyin 0xc16e7010][SLP] getty
532 c1a0ac480 1 532 0004002 [SLPQ ttyin 0xc16e7410][SLP] getty
531 c1887a3c0 1 531 0004002 [SLPQ ttyin 0xc16e7810][SLP] getty
530 c18870000 1 530 0004002 [SLPQ ttyin 0xc16e7c10][SLP] getty
529 c183420c0 1 529 0004002 [SLPQ ttyin 0xc16e8010][SLP] getty
513 c1a0a20c0 1 513 000 [SLPQ select 0xc06a2144][SLP] inetd
493 c1a088300 1 493 000 [SLPQ ufs 0xc1856c08][SLP] cron
481 c1883c48 25 1 481 100 [SLPQ pause 0xc1883c7c][SLP] sendmail
477 c167120c0 1 477 100 [SLPQ pause 0xc1671240][SLP] sendmail
471 c18340000 1 471 100 [SLPQ select 0xc06a2144][SLP] sshd
349 c18874180 1 349 000 [SLPQ ufs 0xc1856c08][SLP] ypbind
336 c18344180 1 336 000 [SLPQ select 0xc06a2144][SLP] rpcbind
317 c18878300 1 317 000 [SLPQ select 0xc06a2144][SLP] syslogd
283 c16718300 1 283 000 [SLPQ select 0xc06a2144][SLP] devd
228 c1883000 65 1 228 100 [SLPQ select 0xc06a2144][SLP] dhclient
208 c188320c0 153 002 [SLPQ select 0xc06a2144][SLP] dhclient
52 c1834a3c0 0 0 204 [SLPQ - 0xd5526d08][SLP] schedcpu
51 c1834c480 0 0 204 [SLPQ - 0xc06aa4cc][SLP] nfsiod 3
50 c161b6240 0 0 204 [SLPQ - 0xc06aa4c8][SLP] nfsiod 2
49 c161b8300 0 0 204 [SLPQ - 0xc06aa4c4][SLP] nfsiod 1
48 c161ba3c0 0 0 204 [SLPQ - 0xc06aa4c0][SLP] nfsiod 0
47 c161bc480 0 0 204 [SLPQ vlruwt 0xc161bc48][SLP] vnlru
46 c1670 0 0 204 [SLPQ getblk 0xcbc36578][SLP] syncer
45 c167020c0 0 0 204 [SLPQ psleep 0xc06a268c][SLP] bufdaemon
44 c16704180 0 0 20c [SLPQ pgzero 0xc06b09c4][SLP] pagezero
9 c16706240 0 0 204 [SLPQ psleep 0xc06b0514][SLP] vmdaemon
8 c16708300 0 0 204 [SLPQ psleep 0xc06b04d0][SLP] pagedaemon
43 c1670a3c0 0 0 204 [IWAIT] swi0: sio
7 c1670c480 0 0 204 [SLPQ - 0xc16dd43c][SLP] fdc0
6 c16710000 0 0 204 [SLPQ - 0xc166f280][SLP] kqueue taskq
42 c160fc480 0 0 204 [IWAIT] swi5:+
5 c161a0000 0 0 204 [SLPQ - 0xc15c1280][SLP] thread taskq
41 c161a20c0 0 0 204 [IWAIT] swi6:+
40 c161a4180 0 0 204 [IWAIT] swi6: task queue
39 c161a6240 0 0 204 [IWAIT] swi2: cambio
38 c161a8300 0 0 204 [SLPQ - 0xc0698140][SLP] yarrow
4 c161aa3c0 0 0 204 [SLPQ - 0xc0698b08][SLP] g_down
3 c161ac480 0 0 204 [SLPQ - 0xc0698b04][SLP] g_up
2 c161b0000 0 0 204 [SLPQ - 0xc0698afc][SLP] g_event
37 c161b20c0 0 0 204 [IWAIT] swi3: vm
36 c161b4180 0 0 20c [IWAIT] swi4: clock sio
35 c16006240 0 0
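A listing this long is easier to read once the sleeping processes are grouped by what they are waiting on. The following is only a rough sketch in Python, not part of the original report: it assumes each process sits on its own line and that sleepers carry the "[SLPQ wmesg wchan][SLP] cmd" suffix seen above. The function name group_by_wait_channel is made up for illustration. Because the numeric columns in the archived listing are partly run together, the sketch relies only on the leading pid and the bracketed suffix.

```python
import re
from collections import defaultdict

# Matches a sleeping process row: leading pid, then the
# "[SLPQ <wmesg> <wchan>][SLP] <cmd>" suffix. Rows such as
# "[IWAIT] swi0: sio" do not match and are skipped.
SLEEPER = re.compile(
    r"^\s*(?P<pid>\d+)\s.*\[SLPQ\s+(?P<wmesg>\S+)\s+(?P<wchan>0x[0-9a-f]+)\]"
    r"\[SLP\]\s+(?P<cmd>.+?)\s*$"
)


def group_by_wait_channel(rows):
    """Map (wmesg, wchan) -> list of (pid, cmd) for every sleeping process."""
    groups = defaultdict(list)
    for row in rows:
        match = SLEEPER.match(row)
        if match:
            key = (match.group("wmesg"), match.group("wchan"))
            groups[key].append((int(match.group("pid")), match.group("cmd")))
    return dict(groups)


# A few rows copied from the listing above.
rows = [
    "647 c1a080000 645 647 110 [SLPQ ufs 0xc1856c08][SLP] cron",
    "547 c1671a3c0 537 547 0004002 [SLPQ ufs 0xc1856c08][SLP] csh",
    "646 c18834182 644 646 110 [SLPQ biord 0xcbcff650][SLP] cron",
    "43 c1670a3c0 0 0 204 [IWAIT] swi0: sio",
]

for (wmesg, wchan), procs in group_by_wait_channel(rows).items():
    print(wmesg, wchan, procs)
```

Grouping on the wait channel as well as the message shows whether the "ufs" sleepers are all blocked on the same address, which is the case for the rows with 0xc1856c08 in the listing.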
Re: indefinite wait buffer: Does this indicate hardware issue?
Hi, Peter,

On 12/17/05, Peter Jeremy <[EMAIL PROTECTED]> wrote:
> On Sat, 2005-Dec-17 04:06:36 +0800, Xin LI wrote:
> >No, it's sometimes other, and is quite infrequent. On the other hand,
> >neither SMART nor error has reported some incident, so I was stuck
> >when looking on hardware issues, as the message does not indicate
> >which disk(s) may have problem...
>
> A hardware error should have been reported as such. But if you
> suspect a disk problem, try dd'ing the swap partition (or the whole
> disk) to /dev/null. If you can read the whole partition, you can
> probably write to it. (Or you could dd /dev/zero to the partitions
> whilst swap is not attached - eg in single user after boot). If you
> suspect retries are a problem, monitor the I/O rate with iostat or
> systat and see if it suddenly drops.

Actually I already tried to dd with no luck... I suspect that the cable may have some problems, though, as one hard disk is working at half the speed of the other one (that is, it's not a sudden drop)... But will this cause the warning?

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
Re: indefinite wait buffer: Does this indicate hardware issue?
On Sat, 2005-Dec-17 04:06:36 +0800, Xin LI wrote:
>No, it's sometimes other, and is quite infrequent. On the other hand,
>neither SMART nor error has reported some incident, so I was stuck
>when looking on hardware issues, as the message does not indicate
>which disk(s) may have problem...

A hardware error should have been reported as such. But if you suspect a disk problem, try dd'ing the swap partition (or the whole disk) to /dev/null. If you can read the whole partition, you can probably write to it. (Or you could dd /dev/zero to the partitions whilst swap is not attached - eg in single user after boot). If you suspect retries are a problem, monitor the I/O rate with iostat or systat and see if it suddenly drops.

--
Peter Jeremy
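The two checks suggested above (read the whole partition, then watch for a sudden drop in I/O rate) can be combined in one pass. What follows is a minimal sketch in Python rather than dd plus iostat, and not something posted in the thread. The function names, the chunk sizes, and the 0.5 threshold are all assumptions chosen for illustration. The device path is also hypothetical and has to be replaced with the real swap partition.

```python
import time


def read_sequentially(path, chunk_size=1024 * 1024, chunks_per_sample=64):
    """Read `path` from start to end, much like `dd if=path of=/dev/null`.

    Returns (total_bytes, rates), where `rates` holds one MB/s figure per
    group of `chunks_per_sample` chunks. A trailing partial group is not
    sampled. A read error from the device surfaces as OSError.
    """
    total = 0
    rates = []
    with open(path, "rb", buffering=0) as dev:
        sample_bytes = 0
        chunks = 0
        started = time.monotonic()
        while True:
            data = dev.read(chunk_size)
            if not data:
                break
            total += len(data)
            sample_bytes += len(data)
            chunks += 1
            if chunks == chunks_per_sample:
                elapsed = max(time.monotonic() - started, 1e-9)
                rates.append(sample_bytes / elapsed / 1e6)
                sample_bytes, chunks, started = 0, 0, time.monotonic()
    return total, rates


def find_drops(rates, ratio=0.5):
    """Return indexes of samples slower than `ratio` times the mean of all earlier samples."""
    drops = []
    for i in range(1, len(rates)):
        baseline = sum(rates[:i]) / i
        if rates[i] < ratio * baseline:
            drops.append(i)
    return drops


# Hypothetical usage; "/dev/ad0s1b" is an assumed swap partition name:
#   total, rates = read_sequentially("/dev/ad0s1b")
#   print(total, "bytes read; slow samples at", find_drops(rates))
```

A clean read with no flagged samples does not prove the disk is healthy, but a read error or a cluster of slow samples at the same offsets would point towards the hardware.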
Re: indefinite wait buffer: Does this indicate hardware issue?
Hi, Kris,

On 12/17/05, Kris Kennaway <[EMAIL PROTECTED]> wrote:
[...]
> In that configuration, probably a hardware issue. Is it always the
> same block?

No, it is sometimes a different block, and it happens quite infrequently. On the other hand, neither SMART nor any error message has reported an incident, so I got stuck when looking into hardware issues, as the message does not indicate which disk(s) may have a problem...

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
Re: indefinite wait buffer: Does this indicate hardware issue?
On Sat, Dec 17, 2005 at 12:49:48AM +0800, Xin LI wrote:
> Dear folks,
>
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
>
> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.
>
> Does this indicate a hardware issue, or some bugs elsewhere?

In that configuration, probably a hardware issue. Is it always the same block?

This message triggers as a false positive in other cases when swapping is too slow, like if you swap to a swapfile on a heavily loaded filesystem.

Kris
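One way to answer the "is it always the same block?" question is to tally the block numbers from the logged warnings. The sketch below is in Python and is not from the thread. It assumes the warning appears in the log exactly in the form quoted above, and count_blocks is a made-up helper name.

```python
import re
from collections import Counter

# Pattern for the warning exactly as quoted in this thread.
SWAP_WARNING = re.compile(
    r"swap_pager: indefinite wait buffer: "
    r"bufobj: (?P<bufobj>\d+), blkno: (?P<blkno>\d+), size: (?P<size>\d+)"
)


def count_blocks(lines):
    """Return a Counter mapping block number -> number of warnings seen."""
    counts = Counter()
    for line in lines:
        match = SWAP_WARNING.search(line)
        if match:
            counts[int(match.group("blkno"))] += 1
    return counts


# Example with the message from the original report plus an unrelated line.
sample_log = [
    "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096",
    "some unrelated kernel message",
    "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096",
]

for blkno, hits in count_blocks(sample_log).most_common():
    print(f"blkno {blkno}: {hits} warning(s)")
```

If the tally is dominated by one block number, that fits the hardware explanation better. Block numbers that are spread widely fit slow swapping or a driver problem better, although neither result is conclusive on its own.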
Re: indefinite wait buffer: Does this indicate hardware issue?
Hi, Scott,

On 12/17/05, Scott Long <[EMAIL PROTECTED]> wrote:
[...]
> It means that either the hardware or driver lost a transaction, or that
> some sort of LOR-type situation happened in the VM that is preventing the
> transaction from completing. First of all, do you have more than 4GB of
> RAM?

No, actually it has only 1GB of memory...

BTW, wouldn't a LOR cause a deadlock if it is preventing the transaction from completing?

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
Re: indefinite wait buffer: Does this indicate hardware issue?
Hi,

On 12/17/05, Stanislaw Halik <[EMAIL PROTECTED]> wrote:
> Xin LI <[EMAIL PROTECTED]> wrote:
> > I have a box indicating the following sometimes:
> > "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
> > Does this indicate a hardware issue, or some bugs elsewhere?
>
> are you using swapfiles? swap memory through files's performance is kind
> of lower than in swap partitions and it might be the cause. i've got
> same messages from a server with swap file, but nothing which would
> affect stability.

No. I use two swap partitions that are on two disks...

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
Re: indefinite wait buffer: Does this indicate hardware issue?
Xin LI wrote:
> Dear folks,
>
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
>
> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.
>
> Does this indicate a hardware issue, or some bugs elsewhere?
>
> Cheers,
> --

It means that either the hardware or driver lost a transaction, or that some sort of LOR-type situation happened in the VM that is preventing the transaction from completing. First of all, do you have more than 4GB of RAM?

Scott
Re: indefinite wait buffer: Does this indicate hardware issue?
Xin LI <[EMAIL PROTECTED]> wrote:
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.
> Does this indicate a hardware issue, or some bugs elsewhere?

Are you using swapfiles? Swapping to a file performs somewhat worse than swapping to a partition, and that might be the cause. I've got the same messages from a server with a swap file, but nothing that would affect stability.

--
Stanisław Halik, http://tehran.lain.pl