On Mon Nov 15 23:23:12 EST 2010, lu...@proxima.alt.za wrote:
> Regarding the "deadlock" report that I occasionally see on my CPU
> server console, I won't bore anyone with PC addresses or anything like
> that, but I will recommend something I believe to be a possible
> trigger: the failure always seems to occur within "exportfs", which in
> this case is used exclusively to run stats(1) remotely from my
> workstation. So the recommendation is that somebody like Erik, who is
> infinitely more clued up than I am in the kernel arcana should run one
> or more stats sessions into a cpu server (I happen to be running
> fossil, so maybe Erik won't see this) and see if he can also trigger this
> behaviour. I'm hoping that it is not platform specific.
>
> Right now, I'm short of skills as well as a serial console :-(

i run stats all the time. i've never seen a lock loop
caused by stats.

exportfs gets blamed all the time for the sins of others.
possible culprits are the tcp/ip stack, the kernel devices
that stats accesses and, of course, the channel code itself.

it would be a good idea for you to track down all the pcs
involved and send them along. i can't think of another way
of narrowing down the list of potential suspects. not all of
our usual suspects have an alibi.

i assume you've fixed this? (not yet fixed on sources.)

/n/sources/plan9//sys/src/9/port/chan.c:1012,1018 - chan.c:1012,1020
  	/*
  	 * mh->mount->to == c, so start at mh->mount->next
  	 */
+ 	f = nil;
  	rlock(&mh->lock);
+ 	if(mh->mount)
  	for(f = mh->mount->next; f; f = f->next)
  		if((wq = ewalk(f->to, nil, names+nhave, ntry)) != nil)
  			break;

- erik