Re: fsck_ufs locked in snaplk

2006-04-30 Thread Dmitry Morozovsky
On Sat, 29 Apr 2006, Dmitry Morozovsky wrote: DM KK I'll try to build DDB kernel tomorrow evening to check. Which commands should I DM KK issue in ddb? DM KK DM KK 'show lockedvnods', 'ps' and 'alltrace' are important. DM DM Well, the common usage pattern did not lead to a lock today. I made

Re: fsck_ufs locked in snaplk

2006-04-29 Thread Dmitry Morozovsky
On Mon, 24 Apr 2006, Kris Kennaway wrote: KK I'll try to build DDB kernel tomorrow evening to check. Which commands should I KK issue in ddb? KK KK 'show lockedvnods', 'ps' and 'alltrace' are important. Well, the common usage pattern did not lead to a lock today. I made some snapshots on

Re: fsck_ufs locked in snaplk

2006-04-26 Thread Dmitry Morozovsky
On Tue, 25 Apr 2006, Kris Kennaway wrote: KK What people are seeing now must be some other problem that I wasn't KK able to reproduce. KK KK Once I hear back from someone who can reproduce it with debugging KK enabled (I'm also trying) we can try to fix it. Please try to simulate a user who is

Re: fsck_ufs locked in snaplk

2006-04-26 Thread Kostik Belousov
On Wed, Apr 26, 2006 at 01:43:42PM +0400, Dmitry Morozovsky wrote: On Tue, 25 Apr 2006, Kris Kennaway wrote: KK What people are seeing now must be some other problem that I wasn't KK able to reproduce. KK KK Once I hear back from someone who can reproduce it with debugging KK enabled (I'm

Re: fsck_ufs locked in snaplk

2006-04-26 Thread Chris Dillon
Sorry Dmitry, you'll get this again since I forgot to reply to the list the first time. Quoting Dmitry Morozovsky: On Tue, 25 Apr 2006, Kris Kennaway wrote: KK What people are seeing now must be some other problem that I wasn't KK able to reproduce. KK KK Once I hear back

Re: fsck_ufs locked in snaplk

2006-04-26 Thread Pawel Jakub Dawidek
On Wed, Apr 26, 2006 at 04:36:17PM +0300, Kostik Belousov wrote: + On Wed, Apr 26, 2006 at 01:43:42PM +0400, Dmitry Morozovsky wrote: + On Tue, 25 Apr 2006, Kris Kennaway wrote: + + KK What people are seeing now must be some other problem that I wasn't + KK able to reproduce. + KK + KK Once

Re: fsck_ufs locked in snaplk

2006-04-26 Thread secmgr
Chris Dillon wrote: I had problems with snapshots and hangs in 5.x. For that, a daily reboot would keep the problems at bay. I upgraded to 6.0 and the problems completely disappeared. I kept 6.0-STABLE running for weeks. Somewhere along the line, as 6.1 approached, similar problems

Re: fsck_ufs locked in snaplk

2006-04-26 Thread Kostik Belousov
On Wed, Apr 26, 2006 at 06:42:28PM +0200, Pawel Jakub Dawidek wrote: On Wed, Apr 26, 2006 at 04:36:17PM +0300, Kostik Belousov wrote: + On Wed, Apr 26, 2006 at 01:43:42PM +0400, Dmitry Morozovsky wrote: + On Tue, 25 Apr 2006, Kris Kennaway wrote: + + KK What people are seeing now must be

Re: fsck_ufs locked in snaplk

2006-04-26 Thread Kris Kennaway
On Wed, Apr 26, 2006 at 11:23:43AM -0600, secmgr wrote: Chris Dillon wrote: I had problems with snapshots and hangs in 5.x. For that, a daily reboot would keep the problems at bay. I upgraded to 6.0 and the problems completely disappeared. I kept 6.0-STABLE running for weeks.

Re: fsck_ufs locked in snaplk

2006-04-26 Thread Adrian Wontroba
On Tue, Apr 25, 2006 at 11:46:03PM +0200, Torfinn Ingolfsen wrote: It could also be viewed as irresponsible to have servers in production _without_ a corresponding test system to test proposed changes on. True, but some of us are blessed with a collection of assorted ancient cast-off servers, and

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Dmitry Morozovsky
On Mon, 24 Apr 2006, Kris Kennaway wrote: KK KK I'll try to build DDB kernel tomorrow evening to check. Which commands should I KK KK issue in ddb? KK KK KK KK 'show lockedvnods', 'ps' and 'alltrace' are important. KK KK Last note: are these added lines enough, or are some unneeded?

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Chris Dillon
Quoting Dmitry Morozovsky: On Mon, 24 Apr 2006, Kris Kennaway wrote: KK Also you should add DEBUG_LOCKS and DEBUG_VFS_LOCKS on the off chance KK they catch the problem. I got one thought about the source of these hangs/crashes: this machine is the only one with actively

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kostik Belousov
On Tue, Apr 25, 2006 at 08:09:32AM -0500, Chris Dillon wrote: Quoting Dmitry Morozovsky: On Mon, 24 Apr 2006, Kris Kennaway wrote: KK Also you should add DEBUG_LOCKS and DEBUG_VFS_LOCKS on the off chance KK they catch the problem. I got one thought about the source of

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Chris Dillon
Quoting Kostik Belousov: I'm going to update to the latest 6.1 code this evening and enable INVARIANTS, WITNESS, and the two DEBUG_LOCKS options in the kernel to see if it catches anything. Please, also add DDB to the kernel and show the result of the show lockedvnods

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Dmitry Morozovsky
On Tue, 25 Apr 2006, Chris Dillon wrote: CD Please, also add DDB to the kernel and show the result of the CD show lockedvnods CD alltrace CD ps CD in the DDB after the deadlock, as asked by Kris Kennaway earlier CD in this thread! CD CD CD OK, I've added DDB, but all of the information

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kostik Belousov
On Tue, Apr 25, 2006 at 07:06:11PM +0400, Dmitry Morozovsky wrote: On Tue, 25 Apr 2006, Chris Dillon wrote: CD Please, also add DDB to the kernel and show the result of the CD show lockedvnods CD alltrace CD ps CD in the DDB after the deadlock, as asked by Kris Kennaway earlier CD

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Dmitry Morozovsky
On Tue, 25 Apr 2006, Kostik Belousov wrote: KB I just set up a lab machine with a serial console, compiled a minimal kernel with quotas KB and KDB+WITNESS, and immediately after ``quotacheck /var quotaon /var'' got KB KB kdb_backtrace(d663aba0,c051f402,c05f7da3,c05fe731,c32cb414) at KB

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kostik Belousov
On Tue, Apr 25, 2006 at 08:05:25PM +0400, Dmitry Morozovsky wrote: On Tue, 25 Apr 2006, Kostik Belousov wrote: KB I just set up a lab machine with a serial console, compiled a minimal kernel with quotas KB and KDB+WITNESS, and immediately after ``quotacheck /var quotaon /var'' got KB KB

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 06:39:09PM +0300, Kostik Belousov wrote: Obviously, revisions 1.78, 1.79 of the sys/ufs/ufs/ufs_quota.c should be MFCed. Try this patch (note, I have not tested it): WTF, I could have sworn I merged that! Yes, this patch is needed. However, I don't think it's the cause

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 08:09:32AM -0500, Chris Dillon wrote: Quoting Dmitry Morozovsky: On Mon, 24 Apr 2006, Kris Kennaway wrote: KK Also you should add DEBUG_LOCKS and DEBUG_VFS_LOCKS on the off chance KK they catch the problem. I got one thought about the source of

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Dmitry Morozovsky
On Tue, 25 Apr 2006, Kris Kennaway wrote: KK OK, I wish you and others had responded to my call for testing a month KK or more ago :) All (both) of the responses indicated that the quota KK problems had been fixed following changes made then. At this point it KK may be too late for 6.x, but

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 09:43:14PM +0400, Dmitry Morozovsky wrote: On Tue, 25 Apr 2006, Kris Kennaway wrote: KK OK, I wish you and others had responded to my call for testing a month KK or more ago :) All (both) of the responses indicated that the quota KK problems had been fixed following

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Mike Jakubik
Dmitry Morozovsky wrote: On Tue, 25 Apr 2006, Kris Kennaway wrote: KK OK, I wish you and others had responded to my call for testing a month KK or more ago :) All (both) of the responses indicated that the quota KK problems had been fixed following changes made then. At this point it KK may be

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 05:02:00PM -0400, Mike Jakubik wrote: Dmitry Morozovsky wrote: On Tue, 25 Apr 2006, Kris Kennaway wrote: KK OK, I wish you and others had responded to my call for testing a month KK or more ago :) All (both) of the responses indicated that the quota KK problems had

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Mike Jakubik
Kris Kennaway wrote: This is true, but I hope you recognise that a good part of the responsibility for this falls on the users when asked to test proposed fixes. If the developers are not aware of remaining problems they can't reasonably be expected to fix them :-) Indeed, but the

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Vlad Skvortsov
Mike Jakubik wrote: Kris Kennaway wrote: This is true, but I hope you recognise that a good part of the responsibility for this falls on the users when asked to test proposed fixes. If the developers are not aware of remaining problems they can't reasonably be expected to fix them :-)

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 05:37:37PM -0400, Mike Jakubik wrote: Kris Kennaway wrote: This is true, but I hope you recognise that a good part of the responsibility for this falls on the users when asked to test proposed fixes. If the developers are not aware of remaining problems they can't

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Torfinn Ingolfsen
On Tue, 25 Apr 2006 17:37:37 -0400 Mike Jakubik wrote: Indeed, but the developers should also realize that a lot of users have servers in production and cannot afford the downtime, or simply don't have the resources to test. I think the developers should also spend a little

Re: fsck_ufs locked in snaplk

2006-04-25 Thread JoaoBR
On Tuesday 25 April 2006 18:37, Mike Jakubik wrote: Kris Kennaway wrote: This is true, but I hope you recognise that a good part of the responsibility for this falls on the users when asked to test proposed fixes. If the developers are not aware of remaining problems they can't

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Atanas
Kris Kennaway said the following on 4/25/06 9:22 AM: On Tue, Apr 25, 2006 at 06:39:09PM +0300, Kostik Belousov wrote: Obviously, revisions 1.78, 1.79 of the sys/ufs/ufs/ufs_quota.c should be MFCed. Try this patch (note, I have not tested it): WTF, I could have sworn I merged that! Yes, this

Re: fsck_ufs locked in snaplk

2006-04-25 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 06:12:10PM -0700, Atanas wrote: Kris Kennaway said the following on 4/25/06 9:22 AM: On Tue, Apr 25, 2006 at 06:39:09PM +0300, Kostik Belousov wrote: Obviously, revisions 1.78, 1.79 of the sys/ufs/ufs/ufs_quota.c should be MFCed. Try this patch (note, I have not tested

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Dmitry Morozovsky
On Mon, 24 Apr 2006, Dmitry Morozovsky wrote: DM KK one of my servers had to be rebooted uncleanly and then I have backgrounded DM KK fsck locked for more than an hour in snaplk: DM KK DM KK 742 root 1 -44 1320K 688K snaplk 0:02 0.00% fsck_ufs DM KK DM KK File system

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Kris Kennaway
On Mon, Apr 24, 2006 at 10:04:57PM +0400, Dmitry Morozovsky wrote: On Mon, 24 Apr 2006, Dmitry Morozovsky wrote: DM KK one of my servers had to be rebooted uncleanly and then I have backgrounded DM KK fsck locked for more than an hour in snaplk: DM KK DM KK 742 root 1 -4

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Michael Butler
Dmitry Morozovsky wrote: one of my servers had to be rebooted uncleanly and then I have backgrounded fsck locked for more than an hour in snaplk: Given that this system came down uncleanly, have you tried starting up in single-user and manually doing an fsck (without '-p') on the afflicted

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Dmitry Morozovsky
On Mon, 24 Apr 2006, Kris Kennaway wrote: KK What bothers me most is that it is the only machine that reproducibly hangs in KK snapshots, and it did not hang before RELENG_5 -> RELENG_6 upgrade. Other KK RELENG_6 machines do snapshot backups flawlessly (knock-on-wood!) KK KK Are you quite certain

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Dmitry Morozovsky
On Mon, 24 Apr 2006, Michael Butler wrote: MB Dmitry Morozovsky wrote: MB one of my servers had to be rebooted uncleanly and then I have MB backgrounded fsck locked for more than an hour in snaplk: MB MB Given that this system came down uncleanly, have you tried starting up in MB single-user

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 12:24:07AM +0400, Dmitry Morozovsky wrote: On Mon, 24 Apr 2006, Kris Kennaway wrote: KK What bothers me most is that it is the only machine that reproducibly hangs in KK snapshots, and it did not hang before RELENG_5 -> RELENG_6 upgrade. Other KK RELENG_6 machines

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Dmitry Morozovsky
On Mon, 24 Apr 2006, Kris Kennaway wrote: KK I'll try to build DDB kernel tomorrow evening to check. Which commands should I KK issue in ddb? KK KK 'show lockedvnods', 'ps' and 'alltrace' are important. Last note: are these added lines enough, or are some unneeded? options KDB

Re: fsck_ufs locked in snaplk

2006-04-24 Thread Kris Kennaway
On Tue, Apr 25, 2006 at 12:45:08AM +0400, Dmitry Morozovsky wrote: On Mon, 24 Apr 2006, Kris Kennaway wrote: KK I'll try to build DDB kernel tomorrow evening to check. Which commands should I KK issue in ddb ? KK KK 'show lockedvnods', 'ps' and 'alltrace' are important. Last note:

fsck_ufs locked in snaplk

2006-04-23 Thread Dmitry Morozovsky
Colleagues, one of my servers had to be rebooted uncleanly and then I have backgrounded fsck locked for more than an hour in snaplk: 742 root 1 -44 1320K 688K snaplk 0:02 0.00% fsck_ufs File system in question is 200G gmirror on SATA. Usually making a snapshot (e.g., for

Re: fsck_ufs locked in snaplk

2006-04-23 Thread Kris Kennaway
On Sun, Apr 23, 2006 at 07:35:37PM +0400, Dmitry Morozovsky wrote: Colleagues, one of my servers had to be rebooted uncleanly and then I have backgrounded fsck locked for more than an hour in snaplk: 742 root 1 -44 1320K 688K snaplk 0:02 0.00% fsck_ufs File system in

Re: fsck_ufs locked in snaplk

2006-04-23 Thread Dmitry Morozovsky
On Sun, 23 Apr 2006, Kris Kennaway wrote: KK one of my servers had to be rebooted uncleanly and then I have backgrounded KK fsck locked for more than an hour in snaplk: KK KK 742 root 1 -44 1320K 688K snaplk 0:02 0.00% fsck_ufs KK KK File system in question is 200G