Re: Quotas and 2.5.x
Hi Hans,

On Tue, 14 Jan 2003 21:06:40 +0300 Hans Reiser [EMAIL PROTECTED] wrote:

| Philippe, am I right in guessing that you don't need it urgently? If
| so, I will let Chris do it, and it might take just a bit longer.

Actually, right now we still hit that nasty bug every time we run quotacheck, which prevents us from enabling quotas on several file servers. That is a big problem for us (this is on 2.4.x).

Since I thought most of the effort is now going into 2.5.x, I hoped to take advantage of all the new functionality, but what I really need is Reiserfs + NFS + quota. On the NFS side there have been so many enhancements in 2.5.x (and in other areas as well) that I'm really eager to give it a try.

Could Chris tell me how long he thinks it will take to have quota in 2.5.x?

I think it would be best for everybody to wait for Chris to implement quotas in 2.5.x, so that people at Namesys can focus on Reiserfs v4 :o)

What do you think?

Thanks,

Philippe
Re: Quotas and 2.5.x
Hello!

On Wed, Jan 15, 2003 at 10:43:06AM +0100, Philippe Gramoullé wrote:

| Actually, right now we still hit that nasty bug every time we run quotacheck, which prevents us from enabling quotas on several file servers. That is a big problem for us (this is on 2.4.x).

Have you tried 2.4. without Chris' data logging patches, but with the original short overflow fix?

Bye,
    Oleg
Re: Core dump in reiserfsck
Vitaly Porotikov wrote:

| Folks,
|
| Perhaps 3.x.1c does not have this bug, but I can't make any changes to my system. I am sending this in the hope that it helps to find a coding error (if it has not been reported before).
|
| [root@tkprzbkup5 /root]# mount
| /dev/rza1 on / type ext2 (rw)
| none on /proc type proc (rw)
| none on /dev/pts type devpts (rw,gid=5,mode=620)
| /dev/rze1 on /export type reiserfs (ro)
| [root@tkprzbkup5 /root]# reiserfsck --check /dev/rze1
|
| -reiserfsck, 2002-
| reiserfsprogs 3.x.1b
|
| Will read-only check consistency of the filesystem on /dev/rze1
| Will put log info to 'stdout'
|
| Do you want to run this program?[N/Yes] (note need to type Yes):Yes
| ###
| reiserfsck --check started at Wed Jan 15 10:41:40 2003
| ###
| Filesystem seems mounted read-only. Skipping journal replay..
| Checking S+tree../ 72 (of 98)/101 (of 153)/124 (of 156)bad_indirect_item: block 1907661: item 7659550 76756x1 IND (1), len 8, location 3056 entry count 0, fsck need 0, format new has a pointer 0 to the block 1 whichin tree already
| bad_indirect_item: block 1907661: item 7659550 7675670 0x1 IND (1), len 8, location 3056 entry count 0, fsckd 0, format new has a pointer 1 to the block 500 which is in tree already
| bit 2535358884, bitsize 314620004
| reiserfsck: bitmap.c:160: reiserfs_bitmap_test_bit: Assertion `bit_number < bm->bm_bit_size' failed.
| Aborted (core dumped)
|
| --
| Best regards,
| Vitaly

Please use a more recent version:
ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.6.4.tar.gz

--
Yury Umanets
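The abort above comes from a bitmap lookup whose bit number (2535358884) is far larger than the bitmap size (314620004), which is what a corrupt block pointer would produce. The following is only a minimal sketch of a range-checked lookup that reports such a pointer instead of aborting. It is not the reiserfsprogs code: the struct, the function name and the bit ordering (least significant bit first within each byte) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fsck block bitmap (not the reiserfsprogs type). */
struct demo_bitmap {
    uint64_t bit_size;         /* number of valid bits in map */
    const unsigned char *map;  /* at least (bit_size + 7) / 8 bytes */
};

/* Return 1 if the bit is set, 0 if it is clear, -1 if bit_number is out of range. */
int demo_bitmap_test_bit(const struct demo_bitmap *bm, uint64_t bit_number)
{
    /* A missing bitmap is a programming error, so asserting is reasonable here. */
    assert(bm != NULL && bm->map != NULL);

    /* An out-of-range bit usually means on-disk corruption: report it to the caller. */
    if (bit_number >= bm->bit_size)
        return -1;

    /* Assumed layout: bit 0 is the least significant bit of byte 0. */
    return (bm->map[bit_number >> 3] >> (bit_number & 7)) & 1;
}
```

With the numbers from the report, a caller would get -1 back and could flag the indirect item as bad rather than dumping core.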
Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
Hello!

On Wed, Jan 15, 2003 at 03:35:59AM +0100, Bernhard Sadlowski wrote:

| I am using the attached stress.sh script (probably from this mailing list) to create load on a reiserfs filesystem. It forks 100 (read, write, delete) processes:
|
| # mkreiserfs /dev/sda4
| # mount /dev/sda4 /backup
| # stress.sh -c /usr -n 100 /backup
|
| Then wait until /backup fills up.

Hm. This reminds me of something. Can you reproduce the same problem with the patches from the following location applied?
ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20/
These patches add quota support to reiserfs, but also change some new inode-related operations to prevent deadlocks like the one you are seeing.

| All I/O freezes, and even after killing the script the remaining cp and mv commands don't terminate. They are in status D. A simple ls /backup never comes back. Only a hard powerdown fixes this situation, because init 6 etc. doesn't work.
| I have even activated the reiserfs debug, but I don't see any additional info.

Try executing sysrq-t after the lockup happens, then send us the decoded output, please.
Thank you.

Bye,
    Oleg
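"Status D" in the report above is uninterruptible sleep, which is also what the state field of /proc/<pid>/stat shows. As a rough sketch (not part of stress.sh and not a replacement for the sysrq-t output requested above), the state can be extracted from one such line as follows. The function names are hypothetical, and the code assumes the usual "pid (comm) state ..." layout of that file.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the one-character state field from the contents of a
 * /proc/<pid>/stat file, or 0 if the line cannot be parsed.
 * The command name sits in parentheses and may itself contain spaces
 * or ')' characters, so the state is located after the LAST ')'.
 */
char demo_proc_state(const char *stat_line)
{
    const char *close_paren;

    assert(stat_line != NULL);
    close_paren = strrchr(stat_line, ')');
    if (close_paren == NULL || close_paren[1] != ' ' || close_paren[2] == '\0')
        return 0;
    return close_paren[2];
}

/* Nonzero if the task is in uninterruptible sleep ("status D"). */
int demo_is_blocked(const char *stat_line)
{
    return demo_proc_state(stat_line) == 'D';
}
```

Searching from the last closing parenthesis rather than splitting on spaces is the design point: a command name such as "my (odd) name" would otherwise shift every later field.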
Re: Quotas and 2.5.x
On Wed, 15 Jan 2003 12:46:29 +0300 Oleg Drokin [EMAIL PROTECTED] wrote:

| Hello!
|
| On Wed, Jan 15, 2003 at 10:43:06AM +0100, Philippe Gramoullé wrote:
|
| | Actually, right now we still hit that nasty bug every time we run quotacheck, which prevents us from enabling quotas on several file servers. That is a big problem for us (this is on 2.4.x).
|
| Have you tried 2.4. without Chris' data logging patches, but with the original short
| overflow fix?

Well, my question was more like a Plan B. I think I did, and that it still crashed, always during the quotacheck, but I'll try it again to be 100% sure.

Thanks,

Philippe
Re: Quotas and 2.5.x
Hello!

On Wed, Jan 15, 2003 at 11:59:53AM +0100, Philippe Gramoullé wrote:

| | Have you tried 2.4. without Chris' data logging patches, but with the original short
| | overflow fix?
| Well, my question was more like a Plan B. I think I did, and that it still crashed, always during the quotacheck, but I'll try it again to be 100% sure.

Was that the 2.4.19-presomething you had there before, with only the fix I sent the first time?

Bye,
    Oleg
Re: Quotas and 2.5.x
On Wed, 15 Jan 2003 14:03:36 +0300 Oleg Drokin [EMAIL PROTECTED] wrote:

| Was that the 2.4.19-presomething you had there before, with only the fix I sent the first
| time?

Hmm, not the 2.4.19-presomething IIRC; it was a later kernel. I'll retry with 2.4.19pre6 and get back to you.

Thx,

Philippe
Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
Hello!

On Wed, Jan 15, 2003 at 12:51:00PM +0100, Bernhard Sadlowski wrote:

| | Hm. This reminds me of something. Can you reproduce the same problem with the patches from the following location applied?
| | ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20/
| | These patches add quota support to reiserfs, but also change some new inode-related operations to prevent deadlocks like the one you are seeing.
| The unpatched kernel shows the hangs much earlier, so I assume that the above patches solve the problem. With the patches the load goes up slowly but steadily to 100 and I/O does not freeze anymore. vmstat and iostat still show activity.
| I assume you don't need any sysrq-t output now.

Ok. That's a good sign.

| Will the patches be included in 2.4.21?

No, they require quota support that won't be included in 2.4 because of the new quota formats and so on.
I will extract the relevant bits from the patch, though, and send you a short version without quota once it is ready.

Thank you.

Bye,
    Oleg
Re: Quotas and 2.5.x
On Wed, 2003-01-15 at 04:43, Philippe Gramoullé wrote:

| Hi Hans,
|
| On Tue, 14 Jan 2003 21:06:40 +0300 Hans Reiser [EMAIL PROTECTED] wrote:
|
| | Philippe, am I right in guessing that you don't need it urgently? If
| | so, I will let Chris do it, and it might take just a bit longer.
|
| Actually, right now we still hit that nasty bug every time we run quotacheck, which prevents us from enabling quotas on several file servers. That is a big problem for us (this is on 2.4.x).

Could you please list the kernel revisions and patches you were using when you tested the overflow fix?

| Since I thought most of the effort is now going into 2.5.x, I hoped to take advantage of all the new functionality, but what I really need is Reiserfs + NFS + quota. On the NFS side there have been so many enhancements in 2.5.x (and in other areas as well) that I'm really eager to give it a try.

2.5.x has many enhancements and many bugs. I strongly recommend against using it for any type of production work, since the development path isn't stable. By this I mean bug fixes might be mixed with other large changes that break things.

| Could Chris tell me how long he thinks it will take to have quota in 2.5.x?

Honestly, I'm not sure. I'm trying for a minimal data logging port to get the journaling improvements in; even -o data=journal isn't immediately supported.

-chris
Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
Hello!

On Wed, Jan 15, 2003 at 02:58:04PM +0300, Oleg Drokin wrote:

| I will extract the relevant bits from the patch, though, and send you a short version without quota once it is ready.

Ok, here is the patch. Can you give it a try and see if it also helps?
I tested it locally and it works for me. If you confirm everything is ok, I will try to get it into 2.4.21 in time.

Bye,
    Oleg

--- linux-2.4.20/fs/reiserfs/namei.c	Fri Nov 29 02:53:15 2002
+++ linux-2.4.20-t/fs/reiserfs/namei.c	Wed Jan 15 17:08:20 2003
@@ -488,27 +488,58 @@
     return 0;
 }
 
+/* quota utility function, call if you've had to abort after calling
+** new_inode_init, and have not called reiserfs_new_inode yet.
+** This should only be called on inodes that do not have stat data
+** inserted into the tree yet.
+*/
+static int drop_new_inode(struct inode *inode) {
+    make_bad_inode(inode) ;
+    iput(inode) ;
+    return 0 ;
+}
+
+/* utility function that does setup for reiserfs_new_inode.
+** DQUOT_ALLOC_INODE cannot be called inside a transaction, so we had
+** to pull some bits of reiserfs_new_inode out into this func.
+*/
+static int new_inode_init(struct inode *inode, struct inode *dir, int mode) {
+
+    /* the quota init calls have to know who to charge the quota to, so
+    ** we have to set uid and gid here
+    */
+    inode->i_uid = current->fsuid;
+    inode->i_mode = mode;
+
+    if (dir->i_mode & S_ISGID) {
+        inode->i_gid = dir->i_gid;
+        if (S_ISDIR(mode))
+            inode->i_mode |= S_ISGID;
+    } else
+        inode->i_gid = current->fsgid;
+    return 0 ;
+}
+
 static int reiserfs_create (struct inode * dir, struct dentry *dentry, int mode)
 {
     int retval;
     struct inode * inode;
-    int windex ;
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2 ;
     struct reiserfs_transaction_handle th ;
 
-
     if (!(inode = new_inode(dir->i_sb))) {
       return -ENOMEM ;
     }
+    retval = new_inode_init(inode, dir, mode) ;
+    if (retval)
+        return retval ;
+
     journal_begin(&th, dir->i_sb, jbegin_count) ;
     th.t_caller = "create" ;
-    windex = push_journal_writer("reiserfs_create") ;
-    inode = reiserfs_new_inode (&th, dir, mode, 0, 0/*i_size*/, dentry, inode, &retval);
-    if (!inode) {
-        pop_journal_writer(windex) ;
-        journal_end(&th, dir->i_sb, jbegin_count) ;
-        return retval;
+    retval = reiserfs_new_inode (&th, dir, mode, 0, 0/*i_size*/, dentry, inode);
+    if (retval) {
+        goto out_failed ;
     }
 
     inode->i_op = &reiserfs_file_inode_operations;
@@ -520,20 +551,19 @@
     if (retval) {
         inode->i_nlink--;
         reiserfs_update_sd (&th, inode);
-        pop_journal_writer(windex) ;
-        // FIXME: should we put iput here and have stat data deleted
-        // in the same transactioin
         journal_end(&th, dir->i_sb, jbegin_count) ;
-        iput (inode);
-        return retval;
+        iput(inode) ;
+        goto out_failed ;
     }
     reiserfs_update_inode_transaction(inode) ;
     reiserfs_update_inode_transaction(dir) ;
 
     d_instantiate(dentry, inode);
-    pop_journal_writer(windex) ;
     journal_end(&th, dir->i_sb, jbegin_count) ;
     return 0;
+
+out_failed:
+    return retval ;
 }
 
 
@@ -541,21 +571,21 @@
 {
     int retval;
     struct inode * inode;
-    int windex ;
     struct reiserfs_transaction_handle th ;
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 3;
 
     if (!(inode = new_inode(dir->i_sb))) {
       return -ENOMEM ;
     }
+    retval = new_inode_init(inode, dir, mode) ;
+    if (retval)
+        return retval ;
+
     journal_begin(&th, dir->i_sb, jbegin_count) ;
-    windex = push_journal_writer("reiserfs_mknod") ;
 
-    inode = reiserfs_new_inode (&th, dir, mode, 0, 0/*i_size*/, dentry, inode, &retval);
-    if (!inode) {
-        pop_journal_writer(windex) ;
-        journal_end(&th, dir->i_sb, jbegin_count) ;
-        return retval;
+    retval = reiserfs_new_inode(&th, dir, mode, 0, 0/*i_size*/, dentry, inode);
+    if (retval) {
+        goto out_failed;
     }
 
     init_special_inode(inode, mode, rdev) ;
@@ -571,16 +601,17 @@
     if (retval) {
         inode->i_nlink--;
         reiserfs_update_sd (&th, inode);
-        pop_journal_writer(windex) ;
         journal_end(&th, dir->i_sb, jbegin_count) ;
-        iput (inode);
-        return retval;
+        iput(inode) ;
+        goto out_failed;
     }
 
     d_instantiate(dentry, inode);
-    pop_journal_writer(windex) ;
     journal_end(&th, dir->i_sb, jbegin_count) ;
     return 0;
+
+out_failed:
+    return retval ;
 }
 
 
@@ -588,15 +619,18 @@
 {
     int retval;
     struct inode * inode;
-    int windex ;
     struct reiserfs_transaction_handle th ;
     int jbegin_count = JOURNAL_PER_BALANCE_CNT * 3;
 
+    mode = S_IFDIR | mode;
     if (!(inode = new_inode(dir->i_sb))) {
       return -ENOMEM ;
     }
+    retval = new_inode_init(inode, dir, mode) ;
+    if (retval)
+        return retval ;
+
     journal_begin(&th, dir->i_sb, jbegin_count) ;
-    windex =
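The ownership rule that the patch moves into new_inode_init (and runs before journal_begin) can be tried out in user space. What follows is a sketch under stated assumptions rather than kernel code: struct demo_inode and demo_new_inode_init are hypothetical stand-ins for struct inode and new_inode_init, and the fsuid and fsgid parameters take the place of current->fsuid and current->fsgid.

```c
#define _XOPEN_SOURCE 700  /* for S_ISGID and S_IFDIR under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical user-space stand-in for the few struct inode fields involved. */
struct demo_inode {
    uid_t uid;
    gid_t gid;
    mode_t mode;
};

/*
 * Fill in owner, group and mode for a new inode created in dir.
 * fsuid and fsgid stand in for current->fsuid and current->fsgid.
 */
void demo_new_inode_init(struct demo_inode *inode, const struct demo_inode *dir,
                         mode_t mode, uid_t fsuid, gid_t fsgid)
{
    assert(inode != NULL && dir != NULL);

    inode->uid = fsuid;
    inode->mode = mode;

    if (dir->mode & S_ISGID) {
        /* setgid directory: children inherit the directory's group */
        inode->gid = dir->gid;
        if (S_ISDIR(mode))
            inode->mode |= S_ISGID;  /* subdirectories stay setgid */
    } else {
        inode->gid = fsgid;
    }
}
```

In the patch this setup has to happen before the transaction starts because, as its comment says, the quota allocation call needs to know whom to charge and cannot be made inside a transaction.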
Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
On 15 Jan 2003 18:01, Oleg Drokin [EMAIL PROTECTED] wrote:

| Ok, here is the patch. Can you give it a try and see if it also helps?
| I tested it locally and it works for me. If you confirm everything is ok, I will try to get it into 2.4.21 in time.

At first glance it seems to work. I will now run the script overnight and will tell you if any problems arise.

Thanks,
Bernhard
Re: Stress test failure for reiserfs, but ext3 ok on Linux 2.4.20
Hello!

On Wed, Jan 15, 2003 at 04:48:52PM +0100, Bernhard Sadlowski wrote:

| | Ok, here is the patch. Can you give it a try and see if it also helps?
| | I tested it locally and it works for me. If you confirm everything is ok, I will try to get it into 2.4.21 in time.
| At first glance it seems to work. I will now run the script overnight and will tell you if any problems arise.

Ok, thank you very much.

Bye,
    Oleg
Re: reiserfsck failure
Unfortunately, the console was completely hosed and I couldn't even get any display on it to record the messages sent to stderr.

Sorry, I don't think I can help there...

Bill

Oleg Drokin wrote:

| Hello!
|
| On Tue, Jan 14, 2003 at 04:01:42PM -0500, Bill Schrier wrote:
|
| | I am sending along both the --logfile and the core file from a recent reiserfsck we were running on our Redhat 7.2 raidzone machine.
|
| Can you say exactly which version that was?
| Also, just before dumping core it should have printed some more information about the assertion failure on stderr; we are interested in that message too.
|
| Thank you.
|
| Bye,
|     Oleg

--
William J. Schrier      Phone: 412.968.5780 x151
Neolinear, Inc.         Fax:   412.968.5788
583 Epsilon Drive       Email: [EMAIL PROTECTED]
Pittsburgh, PA 15238
[PATCH] data logging patches available for 2.4.21-preX
Hello all,

I've updated the data logging patches to 2.4.21-pre3. Once the suse mirror is done updating, you can download them from:

ftp.suse.com/pub/people/mason/patches/data-logging/2.4.21

Changes:

mount -o remount can switch between data modes.

transaction overflow bug fix (BUG in journal_mark_dirty)

fix for -ENOSPC hang when the block size < page size (ia64, alpha)

integration with akpm's b_journal_head code from -ac. This should make it possible for these patches to work both on vanilla kernels and on Alan's branch. I haven't tried this yet though, so testers would be appreciated.

commit_super is now sync_fs

I had to rework some of the data=ordered buffer handling code to merge with the changes in 2.4.21-preX, so use some caution with this code.

-chris
Re: How to break a reiserfs on Linux 2.4.20
In article [EMAIL PROTECTED], Nikita Danilov [EMAIL PROTECTED] wrote:

| Zygo Blaxell writes:
| | In article avprcs$bfi$[EMAIL PROTECTED], Zygo Blaxell [EMAIL PROTECTED] wrote:
| | | I think I'm seeing a pattern of failure. ...
| |
| | And now I can reliably reproduce it. It has nothing to do with MD, linear, raid, SMP, or unclean shutdowns. I can reproduce this bug on a plain IDE disk partition in about three hours on Linux 2.4.20 (compiled for SMP but running on UP, full .config and system details available on request). My test system has about 4 gigs under /etc, /usr, and /var, /dev/hdc2 is 25GB, and there is 1G of swap.
|
| Thanks for the report. We shall try to reproduce it tonight.

Were you successful? If your experience is anything like mine, you should have hundreds if not thousands of broken files by now...

--
Zygo Blaxell (Laptop) [EMAIL PROTECTED]
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD
Re: How to break a reiserfs on Linux 2.4.20
Hello!

On Wed, Jan 15, 2003 at 05:44:26PM -0500, Zygo Blaxell wrote:

| | | And now I can reliably reproduce it. It has nothing to do with MD, linear, raid, SMP, or unclean shutdowns. I can reproduce this bug on a plain IDE disk partition in about three hours on Linux 2.4.20 (compiled for SMP but running on UP, full .config and system details available on request). My test system has about 4 gigs under /etc, /usr, and /var, /dev/hdc2 is 25GB, and there is 1G of swap.
| | Thanks for the report. We shall try to reproduce it tonight.
| Were you successful? If your experience is anything like mine, you should have hundreds if not thousands of broken files by now...

Yes, we were able to reproduce the problem and now we are trying to fix it.
Thanks a lot for your help and for the script.

Bye,
    Oleg