On Thu, Nov 13, 2008 at 12:26:42PM +0200, Kostik Belousov wrote:
> On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
> > On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote:
> > > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote:
> > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > > I've been playing around with snapshots lately but I've got a problem on
> > > > > one of my servers running 7-STABLE amd64:
> > > > > 
> > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PALADIN  amd64
> > > > > 
> > > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > > the system completely freezes up:
> > > > > 
> > > > > paladin# cd /u2/.snap/
> > > > > paladin# mksnap_ffs /u2 test.1
> > > > > 
> > > > > It only happens on this one filesystem, though, which might be to do
> > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > > the same RAID has no issues.
> > > > > 
> > > > > Filesystem  1K-blocks       Used     Avail Capacity  Mounted on
> > > > > /dev/da0s1a 2078881084 921821396 990749202    48%    /u2
> > > > > 
> > > > > To clarify "completely freezes up": unresponsive to all services over
> > > > > the network, except ping. On the console I can switch between the ttys,
> > > > > but none of them respond. The only way out is to hit the reset button.
> > > > 
> > > > You need to provide information described in the
> > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> > > > and especially
> > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> > > 
> > > Ok, I've done that, and removed the patch that seemed to fix things.
> > > 
> > > The first thing I notice after doing this on the console is that I can
> > > still ctrl+t the process:
> > > 
> > > load: 0.14  cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
> > > 
> > > But the top and ps I left running on other ttys have all stopped
> > > responding.
> > 
> > Then in my book, the patch didn't fix anything.  :-)  The system is
> > still "deadlocking"; snapshot generation **should not** wedge the system
> > hard like this.
> You systematically mix two completely different issues:
> - first one is the _deadlock_ experienced by Tim;

Re-read what he wrote.  Quote:

"Ok, I've done that, and removed the patch that seemed to fix things.

The first thing I notice after doing this on the console is that I can
still ctrl+t the process:

load: 0.14  cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k

But the top and ps I left running on other ttys have all stopped
responding."

If he can press Control-T, it means SIGINFO can be sent to the
mksnap_ffs process, and the process responds with that information.  So
the system is not deadlocked.  I believe what he is experiencing is what
others have reported: the system becomes completely unusable while
mksnap_ffs is running, but it DOES NOT hang or lock up.  It just becomes
so god-awful slow that processes on the machine literally sit and spin
for minutes at a time.
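
For anyone unfamiliar with the Control-T mechanism, here is a minimal
sketch (not mksnap_ffs's actual code).  On FreeBSD the tty status
character delivers SIGINFO to the foreground process group, and a
process may install a handler to report its own progress.  The names
report_status, progress and run are hypothetical, and the fallback to
SIGUSR1 is an assumption made only so the sketch also runs on systems
that have no SIGINFO.

```python
import os
import signal

# SIGINFO exists on BSD-derived systems. The SIGUSR1 fallback is an
# assumption for portability of this sketch, not FreeBSD behaviour.
STATUS_SIGNAL = getattr(signal, "SIGINFO", signal.SIGUSR1)

progress = {"done": 0, "total": 0}
reports = []


def report_status(signum, frame):
    # Called when the status signal is delivered: record a progress line.
    reports.append("%d/%d steps done" % (progress["done"], progress["total"]))


def run(steps):
    # Stand-in for a long-running job that keeps track of its progress.
    progress["total"] = steps
    for i in range(steps):
        progress["done"] = i + 1


signal.signal(STATUS_SIGNAL, report_status)
run(5)
# Simulate Control-T by sending the status signal to ourselves.
os.kill(os.getpid(), STATUS_SIGNAL)
print(reports)
```

A handler like this only runs if the process is actually scheduled, so
by itself it says nothing about how slow the rest of the system is.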

> - second one is the slowdown during snapshot creation.
> In fact, I may count a third, where dump itself hangs as a usermode process,
> but the kernel still operates normally.
> 
> The patch posted should fix or paper over the first issue for practical purposes.
> The third issue is most likely fixed by the subr_sleepqueue race fix.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"