Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-26 Thread Xin LI
Hi,

On 12/27/05, Nikolay Pavlov <[EMAIL PROTECTED]> wrote:
[snip]
> Seems I have the same problem.
>
>  ad1: req=0xc2998000 SETFEATURES SET TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
>  swap_pager: indefinite bufobj: 0 blkno: 3 size: 4096

This looks like a driver problem, or hardware issue...

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-26 Thread Nikolay Pavlov
On Saturday, 17 December 2005 at  0:49:48 +0800, Xin LI wrote:
> Dear folks,
> 
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
> 
> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.
> 
> Does this indicate a hardware issue, or some bugs elsewhere?
> 
> Cheers,
> --
> Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

Seems I have the same problem.

 ad1: req=0xc2998000 SETFEATURES SET TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
 swap_pager: indefinite bufobj: 0 blkno: 3 size: 4096

 Then it just halts.

 But not every time; sometimes it boots without any problems, and I don't know why.

 Here is additional information:

 
==
 # uname -a
 FreeBSD spirit 6.0-RELEASE FreeBSD 6.0-RELEASE #0: Fri Nov 11 17:00:05 UTC 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SPIRIT  i386
 

 ATA channel 0:
 Master:  no device present
 Slave:   ad1

-- 

= Best regards, Nikolay Pavlov. <<<--- =



Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-24 Thread Xin LI
Hi,

On 12/23/05, Douglas K. Rand <[EMAIL PROTECTED]> wrote:
> Xin> Which scheduler are you using?
>
> SCHED_4BSD.

Thanks.  I have done some experiments (with the ULE scheduler) to try to
trigger the issue, with no luck either :-(

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-23 Thread Douglas K. Rand
Xin> Hi

Hi.

On 19 Dec 2005 14:32:31 -0600, Douglas K. Rand <[EMAIL PROTECTED]> wrote:
Doug> Tracing command swapper pid 0 tid 0 td 0xc0698e20
Doug> sched_switch(c0698e20,0,1) at sched_switch+0x14b
Doug> mi_switch(1,0) at mi_switch+0x1ba
Doug> scheduler(0,81ec00,81e000,0,c042f5d5) at scheduler+0x262
Doug> mi_startup() at mi_startup+0x96
Doug> begin() at begin+0x2c

Xin> Which scheduler are you using?

SCHED_4BSD.


Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-22 Thread Xin LI
Hi,

On 19 Dec 2005 14:32:31 -0600, Douglas K. Rand <[EMAIL PROTECTED]> wrote:
[...]
> Tracing command swapper pid 0 tid 0 td 0xc0698e20
> sched_switch(c0698e20,0,1) at sched_switch+0x14b
> mi_switch(1,0) at mi_switch+0x1ba
> scheduler(0,81ec00,81e000,0,c042f5d5) at scheduler+0x262
> mi_startup() at mi_startup+0x96
> begin() at begin+0x2c

Which scheduler are you using?

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-19 Thread Douglas K. Rand
** On Sat, 17 Dec 2005 00:49:48 +0800, Xin LI <[EMAIL PROTECTED]> said:

Xin> I have a box indicating the following sometimes:
Xin> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"

We are having a very similar problem that we've been trying to
diagnose off and on for a while. About 20% of the time the system will
emit very similar messages. In each case processes trying to write to
disks get stuck in the "ufs" state.

This is a dual CPU AMD Athlon(tm) MP 1600+ system on a Tyan 2460 mobo
with 512 MB of RAM and an ICP Vortex GDT8546RZ SATA RAID controller.

Here is the result of a C-t on an interactive shell via a serial
console:

  load: 0.00  cmd: csh 547 [ufs] 0.02u 0.01s 0% 2516k

With the kernel debugger a few processes will be stuck in the same state:

db> ps
  pid   proc uid  ppid  pgrp  flag   stat  wmesg wchan  cmd
  647 c1a080000   645   647 110 [SLPQ ufs 0xc1856c08][SLP] cron
  646 c18834182   644   646 110 [SLPQ biord 0xcbcff650][SLP] cron
  645 c1a0a8300   493   493 000 [SLPQ ppwait 0xc1a0a830][SLP] cron
  644 c1887c480   493   493 000 [SLPQ ppwait 0xc1887c48][SLP] cron
  643 c1a0a4180   632   642 0004002 [SLPQ wdrain 0xc06a26c4][SLP] bsdtar
  632 c1a08a3c0   631   632 0004002 [SLPQ pause 0xc1a08a70][SLP] tcsh
  631 c1834624 1001   616   631 0004102 [SLPQ wait 0xc1834624][SLP] su
  616 c188720c 1001   615   616 0004002 [SLPQ pause 0xc1887240][SLP] tcsh
  615 c1a0a624 1001   613   613 100 [SLPQ select 0xc06a2144][SLP] sshd
  613 c18836240   471   613 0004100 [SLPQ sbwait 0xc1b3dbc8][SLP] sshd
  547 c1671a3c0   537   547 0004002 [SLPQ ufs 0xc1856c08][SLP] csh
  537 c1883a3c0 1   537 0004102 [SLPQ wait 0xc1883a3c][SLP] login
  536 c18876240 1   536 0004002 [SLPQ ttyin 0xc16e8810][SLP] getty
  535 c16716240 1   535 0004002 [SLPQ ttyin 0xc16e8410][SLP] getty
  534 c1a0aa3c0 1   534 0004002 [SLPQ ttyin 0xc16dac10][SLP] getty
  533 c1671c480 1   533 0004002 [SLPQ ttyin 0xc16e7010][SLP] getty
  532 c1a0ac480 1   532 0004002 [SLPQ ttyin 0xc16e7410][SLP] getty
  531 c1887a3c0 1   531 0004002 [SLPQ ttyin 0xc16e7810][SLP] getty
  530 c18870000 1   530 0004002 [SLPQ ttyin 0xc16e7c10][SLP] getty
  529 c183420c0 1   529 0004002 [SLPQ ttyin 0xc16e8010][SLP] getty
  513 c1a0a20c0 1   513 000 [SLPQ select 0xc06a2144][SLP] inetd
  493 c1a088300 1   493 000 [SLPQ ufs 0xc1856c08][SLP] cron
  481 c1883c48   25 1   481 100 [SLPQ pause 0xc1883c7c][SLP] sendmail
  477 c167120c0 1   477 100 [SLPQ pause 0xc1671240][SLP] sendmail
  471 c18340000 1   471 100 [SLPQ select 0xc06a2144][SLP] sshd
  349 c18874180 1   349 000 [SLPQ ufs 0xc1856c08][SLP] ypbind
  336 c18344180 1   336 000 [SLPQ select 0xc06a2144][SLP] rpcbind
  317 c18878300 1   317 000 [SLPQ select 0xc06a2144][SLP] syslogd
  283 c16718300 1   283 000 [SLPQ select 0xc06a2144][SLP] devd
  228 c1883000   65 1   228 100 [SLPQ select 0xc06a2144][SLP] dhclient
  208 c188320c0 153 002 [SLPQ select 0xc06a2144][SLP] dhclient
   52 c1834a3c0 0 0 204 [SLPQ - 0xd5526d08][SLP] schedcpu
   51 c1834c480 0 0 204 [SLPQ - 0xc06aa4cc][SLP] nfsiod 3
   50 c161b6240 0 0 204 [SLPQ - 0xc06aa4c8][SLP] nfsiod 2
   49 c161b8300 0 0 204 [SLPQ - 0xc06aa4c4][SLP] nfsiod 1
   48 c161ba3c0 0 0 204 [SLPQ - 0xc06aa4c0][SLP] nfsiod 0
   47 c161bc480 0 0 204 [SLPQ vlruwt 0xc161bc48][SLP] vnlru
   46 c1670 0 0 204 [SLPQ getblk 0xcbc36578][SLP] syncer
   45 c167020c0 0 0 204 [SLPQ psleep 0xc06a268c][SLP] bufdaemon
   44 c16704180 0 0 20c [SLPQ pgzero 0xc06b09c4][SLP] pagezero
9 c16706240 0 0 204 [SLPQ psleep 0xc06b0514][SLP] vmdaemon
8 c16708300 0 0 204 [SLPQ psleep 0xc06b04d0][SLP] pagedaemon
   43 c1670a3c0 0 0 204 [IWAIT] swi0: sio
7 c1670c480 0 0 204 [SLPQ - 0xc16dd43c][SLP] fdc0
6 c16710000 0 0 204 [SLPQ - 0xc166f280][SLP] kqueue taskq
   42 c160fc480 0 0 204 [IWAIT] swi5:+
5 c161a0000 0 0 204 [SLPQ - 0xc15c1280][SLP] thread taskq
   41 c161a20c0 0 0 204 [IWAIT] swi6:+
   40 c161a4180 0 0 204 [IWAIT] swi6: task queue
   39 c161a6240 0 0 204 [IWAIT] swi2: cambio
   38 c161a8300 0 0 204 [SLPQ - 0xc0698140][SLP] yarrow
4 c161aa3c0 0 0 204 [SLPQ - 0xc0698b08][SLP] g_down
3 c161ac480 0 0 204 [SLPQ - 0xc0698b04][SLP] g_up
2 c161b0000 0 0 204 [SLPQ - 0xc0698afc][SLP] g_event
   37 c161b20c0 0 0 204 [IWAIT] swi3: vm
   36 c161b4180 0 0 20c [IWAIT] swi4: clock sio
   35 c16006240 0 0 
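
Editorial note: the ps listing above can be summarized by grouping the
sleeping processes by wait message and wait channel, which makes it
easier to see that the "ufs" sleepers all share one channel.  The
following is a minimal Python sketch and not part of the original post;
the function name and the regular expression are assumptions fitted to
the ddb output format shown here, and other FreeBSD versions may print
the columns differently.

```python
import re
from collections import defaultdict

# Matches the "[SLPQ <wmesg> <wchan>][SLP] <cmd>" tail of a ddb "ps" line.
# This pattern is an assumption based on the listing in this thread.
SLEEPER = re.compile(r"\[SLPQ (\S+) (0x[0-9a-f]+)\]\[SLP\]\s+(.*\S)")

def group_sleepers(lines):
    """Group command names by (wmesg, wchan) for sleeping processes."""
    groups = defaultdict(list)
    for line in lines:
        match = SLEEPER.search(line)
        if match:  # lines such as "[IWAIT] swi0: sio" are skipped
            wmesg, wchan, cmd = match.groups()
            groups[(wmesg, wchan)].append(cmd)
    return dict(groups)

if __name__ == "__main__":
    import sys
    # Usage: paste the ddb "ps" output on standard input.
    for (wmesg, wchan), cmds in sorted(group_sleepers(sys.stdin).items()):
        print(f"{wmesg:8} {wchan:12} {len(cmds):3}  {', '.join(cmds)}")
```

Several processes sleeping on the same (wmesg, wchan) pair suggests
they are all waiting for the same resource, which is consistent with
the stuck "ufs" processes described above.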

Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-17 Thread Xin LI
Hi, Peter,

On 12/17/05, Peter Jeremy <[EMAIL PROTECTED]> wrote:
> On Sat, 2005-Dec-17 04:06:36 +0800, Xin LI wrote:
> >No, it is sometimes a different block, and it is quite infrequent.  On
> >the other hand, neither SMART nor any error message has reported an
> >incident, so I am stuck when looking into hardware issues, as the
> >message does not indicate which disk(s) may have a problem...
>
> A hardware error should have been reported as such.  But if you
> suspect a disk problem, try dd'ing the swap partition (or the whole
> disk) to /dev/null.  If you can read the whole partition, you can
> probably write to it.  (Or you could dd /dev/zero to the partitions
> whilst swap is not attached - eg in single user after boot).  If you
> suspect retries are a problem, monitor the I/O rate with iostat or
> systat and see if it suddenly drops.

Actually I already tried dd, and it turned up nothing...  I suspect that
the cable may have a problem, though, as one hard disk is running at
half the speed of the other (that is, it's not a sudden drop)...  But
would this cause the warning?

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Peter Jeremy
On Sat, 2005-Dec-17 04:06:36 +0800, Xin LI wrote:
>No, it is sometimes a different block, and it is quite infrequent.  On
>the other hand, neither SMART nor any error message has reported an
>incident, so I am stuck when looking into hardware issues, as the
>message does not indicate which disk(s) may have a problem...

A hardware error should have been reported as such.  But if you
suspect a disk problem, try dd'ing the swap partition (or the whole
disk) to /dev/null.  If you can read the whole partition, you can
probably write to it.  (Or you could dd /dev/zero to the partitions
whilst swap is not attached - eg in single user after boot).  If you
suspect retries are a problem, monitor the I/O rate with iostat or
systat and see if it suddenly drops.

-- 
Peter Jeremy
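
Editorial note: the read test suggested above (dd'ing the partition to
/dev/null and watching for a sudden drop in I/O rate) can be
approximated in a single script.  The following is a minimal Python
sketch and not part of the original post; the function name, the block
size, the "slow" threshold, and the device path in the usage comment
are all assumptions chosen for illustration.

```python
import time

def read_scan(path, block_size=1 << 20, slow_factor=4.0):
    """Read `path` sequentially and discard the data, timing each block.

    Returns (total_bytes, slow_blocks), where slow_blocks lists the
    indexes of blocks that took more than `slow_factor` times the
    median block time.  An unreadable block raises OSError.
    """
    total = 0
    timings = []  # seconds spent reading each block
    with open(path, "rb", buffering=0) as dev:
        while True:
            start = time.monotonic()
            chunk = dev.read(block_size)
            elapsed = time.monotonic() - start
            if not chunk:  # end of file or device
                break
            total += len(chunk)
            timings.append(elapsed)
    slow = []
    if timings:
        median = sorted(timings)[len(timings) // 2]
        if median > 0:
            slow = [i for i, t in enumerate(timings)
                    if t > slow_factor * median]
    return total, slow

if __name__ == "__main__":
    import sys
    # Usage (hypothetical device name): python read_scan.py /dev/ad1s1b
    total, slow = read_scan(sys.argv[1])
    print(f"read {total} bytes, {len(slow)} unusually slow blocks: {slow[:20]}")
```

A clean pass only shows that every block was readable at the time of
the scan.  Blocks flagged as slow may point at retries, but on a busy
system they can equally be caused by competing I/O.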


Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Xin LI
Hi, Kris,

On 12/17/05, Kris Kennaway <[EMAIL PROTECTED]> wrote:
[...]
> In that configuration, probably a hardware issue.  Is it always the
> same block?

No, it is sometimes a different block, and it is quite infrequent.  On
the other hand, neither SMART nor any error message has reported an
incident, so I am stuck when looking into hardware issues, as the
message does not indicate which disk(s) may have a problem...

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Kris Kennaway
On Sat, Dec 17, 2005 at 12:49:48AM +0800, Xin LI wrote:
> Dear folks,
> 
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
> 
> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.
> 
> Does this indicate a hardware issue, or some bugs elsewhere?

In that configuration, probably a hardware issue.  Is it always the
same block?

This message triggers as a false positive in other cases when swapping
is too slow, like if you swap to a swapfile on a heavily loaded
filesystem.

Kris
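
Editorial note: the question "is it always the same block?" can be
answered from the system log by counting how often each blkno appears
in the swap_pager warnings.  The following is a minimal Python sketch
and not part of the original post; the function name is hypothetical,
and the regular expression is an assumption fitted to the two message
variants quoted in this thread.

```python
import re
from collections import Counter

# Tolerates both variants seen in this thread:
#   swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096
#   swap_pager: indefinite bufobj: 0 blkno: 3 size: 4096
WARNING = re.compile(r"swap_pager: indefinite.*?blkno:\s*(\d+),?\s*size:\s*(\d+)")

def tally_blocks(lines):
    """Count swap_pager 'indefinite wait' warnings per block number."""
    counts = Counter()
    for line in lines:
        match = WARNING.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage: feed the log on standard input, for example /var/log/messages.
    for blkno, hits in tally_blocks(sys.stdin).most_common():
        print(f"blkno {blkno}: {hits} warning(s)")
```

If one block number dominates the tally, a bad spot on the disk
becomes more plausible; warnings spread across many blocks fit better
with a slow or stalled I/O path.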



Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Xin LI
Hi, Scott,

On 12/17/05, Scott Long <[EMAIL PROTECTED]> wrote:
[...]
> It means that either the hardware or driver lost a transaction, or that
> some sort of LOR-type situation happened in the VM that is preventing
> the transaction from completing.  First of all, do you have more than
> 4GB of RAM?

No, actually it has only 1GB of memory...  BTW, wouldn't a LOR cause a
deadlock if it is preventing the transaction from completing?

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Xin LI
Hi,

On 12/17/05, Stanislaw Halik <[EMAIL PROTECTED]> wrote:
> Xin LI <[EMAIL PROTECTED]> wrote:
> > I have a box indicating the following sometimes:
> > "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
> > Does this indicate a hardware issue, or some bugs elsewhere?
>
> Are you using swap files? Swapping to a file performs somewhat worse
> than swapping to a partition, and that might be the cause. I've gotten
> the same messages from a server with a swap file, but nothing that
> would affect stability.

No.  I use two swap partitions that are on two disks...

Cheers,
--
Xin LI <[EMAIL PROTECTED]> http://www.delphij.net

Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Scott Long

Xin LI wrote:

> Dear folks,
>
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"
>
> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.
>
> Does this indicate a hardware issue, or some bugs elsewhere?
>
> Cheers,
> --


It means that either the hardware or driver lost a transaction, or that
some sort of LOR-type situation happened in the VM that is preventing
the transaction from completing.  First of all, do you have more than
4GB of RAM?

Scott


Re: indefinite wait buffer: Does this indicate hardware issue?

2005-12-16 Thread Stanislaw Halik
Xin LI <[EMAIL PROTECTED]> wrote:
> I have a box indicating the following sometimes:
> "swap_pager: indefinite wait buffer: bufobj: 0, blkno: 262169, size: 4096"

> It's running FreeBSD 6.0-RELEASE, with two Maxtor 7Y250P0 hard disks
> attached to Intel ICH5 UDMA100 controller with hw.ata.wc disabled.

> Does this indicate a hardware issue, or some bugs elsewhere?

Are you using swap files? Swapping to a file performs somewhat worse
than swapping to a partition, and that might be the cause. I've gotten
the same messages from a server with a swap file, but nothing that
would affect stability.

-- 
Stanisław Halik, http://tehran.lain.pl
