no reiserfs quota in 2.4 yet? 2.4.21-pre4-ac4 says different
Hi Reiserfs team,

Today I put a new kernel on a server which has reiserfs and needs quota. I searched for the quota patches (found them in the mail archive) and saw that they are very old: ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20, 3 December 2002. They don't apply to a current kernel. I decided to use 2.4.20 with the -pre4 and -ac4 patches.

01-quota-v2-2.4.20.diff has this:

 Quota support
 CONFIG_QUOTA
   If you say Y here, you will be able to set per user limits for disk
-  usage (also called disk quotas). Currently, it works only for the
-  ext2 file system. You need additional software in order to use quota
-  support; for details, read the Quota mini-HOWTO, available from
+  usage (also called disk quotas). Currently, it works for the
+  ext2, ext3, and reiserfs file system. You need additional software
+  in order to use quota support (you can download sources from
+  http://www.sf.net/projects/linuxquota/). For further details, read
+  the Quota mini-HOWTO, available from
   http://www.tldp.org/docs.html#howto. Probably the quota support is
   only useful for multi user systems. If unsure, say N.

-ac4 has this:

 Quota support
 CONFIG_QUOTA
   If you say Y here, you will be able to set per user limits for disk
-  usage (also called disk quotas). Currently, it works only for the
-  ext2 file system. You need additional software in order to use quota
-  support; for details, read the Quota mini-HOWTO, available from
+  usage (also called disk quotas). Currently, it works for the
+  ext2, ext3, and reiserfs file system. You need additional software
+  in order to use quota support (you can download sources from
+  http://www.sf.net/projects/linuxquota/). For further details, read
+  the Quota mini-HOWTO, available from
   http://www.tldp.org/docs.html#howto. Probably the quota support is
   only useful for multi user systems. If unsure, say N.
Because none of the outdated patches apply to -pre4-ac4, and because of the above in -ac4, I thought that a 2.4.21-pre4-ac4 kernel would have quota. This, unfortunately, seems not to be the case. I have this line in fstab:

    /dev/md1 /reiserfs noatime,usrquota,grpquota 0 0

and get this error message:

    reiserfs_getopt: unknown option usrquota

My quota tools are fresh, 3.08. Did I do something wrong? The setup worked with a patched 2.4.19-rc1, but that one became old and we needed a few more modules.

So for now I assume I'm bitten by no-quota-in-current-2.4-yet. If I'm right on that: is there a reason quota is not in 2.4 yet? It has been stable (for me), and it has existed for quite some time now. Did only half of the patches make it to Alan? The CONFIG_QUOTA help text is misleading.

Btw, the FAQ on namesys.com says:

    Is quota-support built-in in the vanilla 2.4 kernels for ReiserFS?

    No, quota support for linux kernels from 2.4 branch are bundled separately and can be obtained from this location. The reason these patches are not included into 2.4 kernel branch is because they implement new quota format and need new quota code too, which is too big of a change for 2.4 series of kernels. Various Linux distributions vendors (ie SuSE) do ship reiserfs-quota enabled kernels, though.

The "from this location" link points to ftp://ftp.suse.com/pub/people/mason/patches/reiserfs/quota-2.4 which contains patches that are one year old.

May I ask, what is the future of quota in reiserfs for the 2.4 kernel? Should I wait for new patches? Try to apply them by hand, or did too much change? Will quota be integrated in the 2.4 kernel soonish?

Thanks for your time!
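The "unknown option usrquota" message above is what an option parser reports when a mount option is missing from its table. The following is a minimal Python sketch of that behaviour, not the kernel's reiserfs_getopt code: both option tables and the function name unknown_options are made up for illustration, and which layer really handles each option is simplified.

```python
# Hypothetical sketch: split an fstab-style option string and report the names
# that an (assumed) filesystem option table does not know about.

# Assumed option tables for illustration only; NOT copied from kernel source.
OPTIONS_WITHOUT_QUOTA = {"noatime", "notail", "ro", "rw"}
OPTIONS_WITH_QUOTA = OPTIONS_WITHOUT_QUOTA | {"usrquota", "grpquota"}


def unknown_options(option_string, known):
    """Return the options in option_string that are not in `known`, in order."""
    return [opt for opt in option_string.split(",") if opt and opt not in known]


if __name__ == "__main__":
    opts = "noatime,usrquota,grpquota"
    # A kernel built without the quota patches rejects both quota options.
    print(unknown_options(opts, OPTIONS_WITHOUT_QUOTA))
    # A kernel whose table includes them accepts the whole string.
    print(unknown_options(opts, OPTIONS_WITH_QUOTA))
```

Under these assumptions, a changed CONFIG_QUOTA help text alone does not help: the filesystem's own option table also has to contain the quota options.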
Re: no reiserfs quota in 2.4 yet? 2.4.21-pre4-ac4 says different
On Mon, 2003-02-17 at 12:39, Ookhoi wrote:
> Hi Reiserfs team, Today I put a new kernel on a server which has reiserfs and needs quota. I searched for the quota patches (found them in the mail archive) and saw that they are very old: ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20 3 december 2002. They don't apply to a current kernel.

Well, 2.4.20 is the current kernel ;-) Which kernel do you want them against? I've got patches against 2.4.21-preX in testing here, but not against -ac. They should merge against -ac more easily now, but I haven't had time to really test it.

Do you want to try the merge on -ac or would you rather try against 2.4.21-preX?

-chris
Re: no reiserfs quota in 2.4 yet? 2.4.21-pre4-ac4 says different
Chris Mason wrote (ao):
> On Mon, 2003-02-17 at 12:39, Ookhoi wrote:
> > Today I put a new kernel on a server which has reiserfs and needs quota. I searched for the quota patches (found them in the mail archive) and saw that they are very old: ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20 3 december 2002. They don't apply to a current kernel.
>
> Well, 2.4.20 is the current kernel ;-) Which kernel do you want them against? I've got patches against 2.4.21-preX in testing here, but not against -ac. They should merge against -ac now more easily, but I haven't had time to really test it. Do you want to try the merge on -ac or would you rather try against 2.4.21-preX

Thanks a lot for your quick answer! Yes, you are of course right that 2.4.20 is the current kernel. My mistake.

I would love to use -ac, so patches against that would be great. But you would make me very happy with patches against -pre too.

Is there any chance that you consider your against-ac patches ready for inclusion in Alan's kernel patches?
Re: Corrupted/unreadable journal: reiser vs. ext3
Vitaly Fertman wrote:

Ok, so the reiserfs kernel code detects an error on disk, what does it do? Print out an error message, maybe BUG? There is an error field in the reiserfs superblock, I hope it is set when the kernel detects something bad. So, now what happens? Maybe the user doesn't read their syslog and doesn't see the error, or the error is just a prelude to memory corruption which causes the system to crash. When the system boots again, it goes on its merry way, mounting the reiserfs filesystem with _known_ errors on it, using bad allocation bitmaps, directories btrees, etc and maybe double allocating blocks or overwriting blocks from other files causing them to become corrupt, etc, etc, etc. Until finally the filesystem is totally corrupt, the system crashes miserably, the user emails this list and reiserfsck has an impossible job trying to fix the filesystem.

Instead, what I propose is to have reiserfsck -a AS A STARTING POINT simply check for a valid reiserfs superblock and the absence of the error flag before declaring the filesystem clean and allowing the system to boot.

What's even worse, the reiserfs_read_super (at least 2.4.18 RH kernel) code OVERWRITES the superblock error status at mount time, making it worse than useless, since each mount hides any errors that were detected before the crash:

    s->u.reiserfs_sb.s_mount_state = SB_REISERFS_STATE(s);
    s->u.reiserfs_sb.s_mount_state = REISERFS_VALID_FS;

Andreas seems reasonable, Vitaly, what are your thoughts?

Next, add journal replay to reiserfsck if it isn't already there,

Why, when it is in the kernel?

Because that is the next stage to allowing reiserfsck do checks on the filesystem after a crash. Do you tell me you would rather (and you must, because it obviously currently does) have reiserfsck just throw away everything in the journal, leaving possibly inconsistent data in the filesystem for it to check? Or maybe make the user mount the filesystem (which obviously has problems or they wouldn't be running reiserfsck to do a full check) just to clear out the journal and maybe risk crashing or corruption if the filesystem is strangely corrupted?

Vitaly, answer this.

Ok, so probably we should make the following changes. The kernel sets IO_ERROR and FS_ERROR flags.

In the case of IO_ERROR, reiserfsck prints the message about hardware problems and returns an error, so the fs does not get mounted at boot. On an attempt to mount the fs with the IO_ERROR flag set, it is mounted ro with some message about hardware problems. When you are sure that the problems have disappeared, you can mount it with a special option clearing this flag, and probably reiserfstune will have some option clearing these flags also.

In the case of FS_ERROR (search_by_key failed, or beyond end of device access, or similar), reiserfsck gets the -a option at boot, replays the journal if needed and checks for the flag. No flag: returns OK. Else: run fix-fixable. Errors left: returns 'errors left uncorrected' and the fs does not get mounted at boot. On an attempt to mount the fs with the flag set, just print the message about mounting the fs with errors and mount it. Not ro here, as the kernel will not do deep analysis of errors and it could be just a small insignificant error.

Sounds good to me. Do it. Reiser4 also.

-- Hans
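The IO_ERROR / FS_ERROR policy proposed above can be restated as a small decision table. The Python sketch below is only a model of the proposal as worded in this thread, not reiserfsprogs or kernel code: the names SbError, boot_check, mount_mode and the clear_io_error parameter are hypothetical, and journal replay is reduced to a comment.

```python
from enum import Flag, auto


class SbError(Flag):
    """Hypothetical superblock error flags, named after the proposal."""
    NONE = 0
    IO_ERROR = auto()
    FS_ERROR = auto()


def boot_check(flags, fix_fixable):
    """Model of `reiserfsck -a` at boot.

    `fix_fixable` is a callable returning True when every error was repaired.
    Returns (ok_to_mount, message).
    """
    if SbError.IO_ERROR in flags:
        # Hardware trouble: report it and refuse, so the fs is not mounted at boot.
        return False, "hardware problems reported"
    # (A real tool would replay the journal here, if needed.)
    if SbError.FS_ERROR not in flags:
        return True, "clean"
    if fix_fixable():
        return True, "fixed by fix-fixable"
    return False, "errors left uncorrected"


def mount_mode(flags, clear_io_error=False):
    """Model of a later manual mount. Returns (mode, message)."""
    if clear_io_error:
        # Stand-in for the proposed special mount option that clears the flag.
        flags = flags & ~SbError.IO_ERROR
    if SbError.IO_ERROR in flags:
        return "ro", "hardware problems reported"
    if SbError.FS_ERROR in flags:
        # Not read-only: the kernel does no deep analysis of the error.
        return "rw", "mounting with errors"
    return "rw", ""


if __name__ == "__main__":
    print(boot_check(SbError.FS_ERROR, lambda: False))
    print(mount_mode(SbError.IO_ERROR))
```

The sketch assumes the error flags survive a mount. That is exactly what the two s_mount_state assignments quoted above prevent, since the second one overwrites the saved state.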
What is [PATCH] 02-directio-fix.diff (namesys.com) for?
Hi!

Is this patch from 030213 needed by anyone using ReiserFS with 2.4.20 and 2.4.21-preX? What is DIRECT IO with reiserfs, from the topic line of the patch:

    # reiserfs: Fix DIRECT IO interference with tail packing

?

Thanks for the info and best regards,

Manuel

(I hope I didn't miss any hidden announcement...)
Re: What is [PATCH] 02-directio-fix.diff (namesys.com) for?
On Mon, 2003-02-17 at 15:55, Manuel Krause wrote:
> Hi! Is this patch from 030213 it needed by anyone using ReiserFS within 2.4.20 and 2.4.21-preX ? What is DIRECT IO with reiserfs from the topic line of the patch: # reiserfs: Fix DIRECT IO interference with tail packing ?

It fixes a bug where a recently unpacked tail might race to the disk with bytes modified via DIRECT IO. The common way to trigger the bug is via a mixture of direct io and regular file access at the same time.

Most people won't see the bug, since it is uncommon to mix regular and direct io that way.

-chris
Error - Partition Correspondance [was Re: Corrupted/unreadable journal: reiser vs. ext3]
On 02/17/2003 08:43 PM, Hans Reiser wrote:

Vitaly Fertman wrote:

Ok, so the reiserfs kernel code detects an error on disk, what does it do? Print out an error message, maybe BUG? There is an error field in the reiserfs superblock, I hope it is set when the kernel detects something bad. So, now what happens? Maybe the user doesn't read their syslog and doesn't see the error, or the error is just a prelude to memory corruption which causes the system to crash. When the system boots again, it goes on its merry way, mounting the reiserfs filesystem with _known_ errors on it, using bad allocation bitmaps, directories btrees, etc and maybe double allocating blocks or overwriting blocks from other files causing them to become corrupt, etc, etc, etc. Until finally the filesystem is totally corrupt, the system crashes miserably, the user emails this list and reiserfsck has an impossible job trying to fix the filesystem.

Instead, what I propose is to have reiserfsck -a AS A STARTING POINT simply check for a valid reiserfs superblock and the absence of the error flag before declaring the filesystem clean and allowing the system to boot.

What's even worse, the reiserfs_read_super (at least 2.4.18 RH kernel) code OVERWRITES the superblock error status at mount time, making it worse than useless, since each mount hides any errors that were detected before the crash:

    s->u.reiserfs_sb.s_mount_state = SB_REISERFS_STATE(s);
    s->u.reiserfs_sb.s_mount_state = REISERFS_VALID_FS;

Andreas seems reasonable, Vitaly, what are your thoughts?

Next, add journal replay to reiserfsck if it isn't already there,

Why, when it is in the kernel?

Because that is the next stage to allowing reiserfsck do checks on the filesystem after a crash. Do you tell me you would rather (and you must, because it obviously currently does) have reiserfsck just throw away everything in the journal, leaving possibly inconsistent data in the filesystem for it to check? Or maybe make the user mount the filesystem (which obviously has problems or they wouldn't be running reiserfsck to do a full check) just to clear out the journal and maybe risk crashing or corruption if the filesystem is strangely corrupted?

Vitaly, answer this.

Ok, so probably we should make the following changes. The kernel sets IO_ERROR and FS_ERROR flags.

In the case of IO_ERROR, reiserfsck prints the message about hardware problems and returns an error, so the fs does not get mounted at boot. On an attempt to mount the fs with the IO_ERROR flag set, it is mounted ro with some message about hardware problems. When you are sure that the problems have disappeared, you can mount it with a special option clearing this flag, and probably reiserfstune will have some option clearing these flags also.

In the case of FS_ERROR (search_by_key failed, or beyond end of device access, or similar), reiserfsck gets the -a option at boot, replays the journal if needed and checks for the flag. No flag: returns OK. Else: run fix-fixable. Errors left: returns 'errors left uncorrected' and the fs does not get mounted at boot. On an attempt to mount the fs with the flag set, just print the message about mounting the fs with errors and mount it. Not ro here, as the kernel will not do deep analysis of errors and it could be just a small insignificant error.

Sounds good to me. Do it. Reiser4 also.

Hi!

BTW, do the ReiserFS errors nowadays print out a usable partition identification (like Chris' actual data-logging patches perform at mount, e.g.)?

I mostly always have 2 partitions with ReiserFS mounted, so -- is it still meaningless to get an error message related to one of them in my logs?

[For a long time now (more than 6 months) I have not got any ReiserFS errors any more, even with data-logging and preempt-kernel applied -- I only read about them on the list. So I don't know the real meaning of the error messages' variable content any more... :-( or really :-))) ]

I posted this circumstance some 3.6-ReiserFS levels ago and someone of your team wanted to implement this after his task-list was done, IIRC.

So, if it's not implemented explicitly in words so far, this would seem to me to be valuable for users, too, IMO.

Best regards,

Manuel
lexicographic ordering is not always best
Suppose that you have small directories, such that the time to do linear searching within the directory is not significant. Suppose that you have a tendency to access files in readdir() order, and that having files laid out in the same order as the directory order is valuable for performance. Suppose that you create files too slowly for allocate on flush to fix this problem, and access them too soon for the repacker to fix this problem. In that case, ordering both directory entries and file bodies in a first created first ordered order is optimal.

How much work would it be to create a reiser4 directory plugin to order in creation time order? Could you do this by simply setting the hash field always to zero for that plugin, and letting the duplicate key code handle things? If it is trivial to do, it might be useful, especially for the analysis of the performance of our algorithms on various benchmarks.

Are you ready to work on implementing file body key assignment in order of directory entries?

-- Hans
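The "hash field always zero" idea can be illustrated with a toy model. This Python sketch is not reiser4 code: the Directory class, the toy name_hash function, and above all the assumption that entries with equal hash are kept in insertion order (standing in for the duplicate key code) are made up for illustration.

```python
import itertools


def name_hash(name):
    """Toy stand-in for a directory hash; NOT the hash reiser4 uses."""
    return sum(name.encode()) % 251


def zero_hash(name):
    """The proposed plugin behaviour: every name gets the same hash field."""
    return 0


class Directory:
    """Entries are keyed by (hash, sequence); readdir() returns them in key order.

    Assumption: duplicate hashes are disambiguated by an insertion counter.
    """

    def __init__(self, hash_fn):
        self._hash_fn = hash_fn
        self._sequence = itertools.count()
        self._entries = []  # list of (hash, sequence, name)

    def create(self, name):
        self._entries.append((self._hash_fn(name), next(self._sequence), name))

    def readdir(self):
        return [name for _, _, name in sorted(self._entries)]


if __name__ == "__main__":
    for hash_fn in (name_hash, zero_hash):
        directory = Directory(hash_fn)
        for name in ("c", "a", "b"):
            directory.create(name)
        print(hash_fn.__name__, directory.readdir())
```

With zero_hash every key collides, so the tie-breaker alone decides the order and readdir() follows creation order. The price is that a lookup by name degenerates to a linear scan, which is why the message restricts the idea to small directories.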
Re: Corrupted/unreadable journal: reiser vs. ext3
On Feb 14, 2003 22:19 +0300, Hans Reiser wrote:

Andreas Dilger wrote:

You are well aware that the e2fsck check intervals can be tuned per-filesystem and even disabled if desired (it prints options for how to do this at mke2fs time and is clearly documented for the experienced user). For a boot-once-a-day machine, the default is to check about once a month (at most 6 months for the time check), and if machines are crashing more often, then they should probably be checked more often because _something_ has to be causing crashes.

The idea that how often you boot determines how often it checks is just silly, sorry.

I guess the shortcoming in the ext2 case is that it counts mounts and not crashes. If it were counting the number of times the filesystem was uncleanly shut down instead of normal shutdowns, would that be more acceptable? The reason I'm still interested in crashes, even if they are not filesystem-related crashes, is because there had to be _something_ which caused a crash (bad code, bad hardware, whatever), and once you have any driver corrupting memory the chance that it is also corrupting filesystem memory exists.

Having reiserfsck just do read-only checks shouldn't force you to type yes (and we mean yes because this is so scary, mere mortals shouldn't be doing this). Hans, you've always talked about making things easy for the average user (error messages and such), don't you think that making a data consistency check for the user a little less intimidating too?

I think that you should have to agree that you have time to wait for fsck before you get stuck with a 1 day large server fsck.

That is definitely true. However, my assumption would be that if someone is running a system with terabytes of data they will read the man page after waiting a day for fsck to complete, or lose their job. It is entirely possible for administrators to disable the per-mount e2fsck checking, and the time-based (6 months by default) checking too, and do fsck themselves. My experience would be that, like backups, people don't do that, so leaving the 6 month check in protects users from themselves.

The other thing to keep in mind is that you can have different levels of automated fsck at boot time, depending on how long they take. You never necessarily have to try and fix anything with fsck -a, just detect errors and leave it up to the user to decide what to do if you find a problem:

- always recover journal, validate superblock, error flag (< 1s)

Don't know how long it takes these things to run, so it is up to you to trade off checks vs. speed, and you could even round-robin them (storing the last checked item in the superblock or something):

- check block allocation bitmaps match superblock counts
- walk directory structure from root, checking for directory corruption
- check btree validity on inodes for up to 10 seconds (or whatever, storing last checked inode in superblock for restarting this test at next one)

By all means, don't do checks for an hour, or allow users to set the maximum boot check duration in the superblock. I'm sure users don't mind waiting 5s at boot time if it means they don't lose data.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
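The round-robin, time-limited boot checks suggested above could be scheduled roughly as in the following Python sketch. It is a model under stated assumptions, not fsck code: the function run_boot_checks, the "next_check" key standing in for a superblock field, and the injectable clock are all hypothetical, and the always-run steps (journal recovery, superblock validation, error flag) are left out.

```python
import time


def run_boot_checks(checks, superblock, budget_seconds, clock=time.monotonic):
    """Run optional checks round-robin until the time budget is used up.

    `checks` is a list of (name, callable) pairs; each callable returns True
    when that part of the filesystem looks consistent. `superblock` is a dict
    standing in for persistent state; "next_check" records where to resume
    at the next boot. Returns the names of the checks that failed.
    """
    failed = []
    if not checks:
        return failed
    deadline = clock() + budget_seconds
    start = superblock.get("next_check", 0) % len(checks)
    for offset in range(len(checks)):
        if clock() >= deadline:
            break  # out of time; the remaining checks wait for the next boot
        index = (start + offset) % len(checks)
        name, check = checks[index]
        if not check():
            failed.append(name)
        # Remember the resume point after every completed check.
        superblock["next_check"] = (index + 1) % len(checks)
    return failed


if __name__ == "__main__":
    sb = {}
    checks = [
        ("bitmaps match superblock counts", lambda: True),
        ("directory walk from root", lambda: True),
        ("btree validity", lambda: True),
    ]
    print(run_boot_checks(checks, sb, budget_seconds=5.0), sb)
```

A check that has started is allowed to finish here, so the budget is only approximate. A check that must stop mid-way, like the 10 second btree scan, would need to save its own resume point as well.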
Re: Error - Partition Correspondance [was Re: Corrupted/unreadable journal: reiser vs. ext3]
Hello!

On Tue, Feb 18, 2003 at 12:35:23AM +0100, Manuel Krause wrote:
> BTW, do the ReiserFS errors nowadays print out a usable partition identification (like Chris actual data-logging patches perform at mount, e.g.)?

Sometimes they do.

> I mostly always have 2 partitions with ReiserFS mounted, so -- is it still meaningless to get an error message related to one of them in my logs?

It depends on what the messages are.

> I posted this circumstance some 3.6-ReiserFS levels ago and someone of your team wanted to implement this after his task-list was done, IIRC.

Yes. I have a patch dating back to May 7th, 2002. But it was never accepted, for a reason I no longer remember. I will dig through my email, though. Probably I will give it another try.

Bye, Oleg
fsck on boot (was: Re: Corrupted/unreadable journal: reiser vs. ext3)
Andreas Dilger wrote (ao):
> The other thing to keep in mind is that you can have different levels of automated fsck at boot time, depending on how long they take. You never necessarily have to try and fix anything with fsck -a, just detect errors and leave it up to the user to decide what to do if you find a problem:
>
> - always recover journal, validate superblock, error flag (< 1s)
>
> Don't know how long it takes these things to run, so it is up to you to trade off checks vs. speed, and you could even round-robin them (storing the last checked item in the superblock or something):
>
> - check block allocation bitmaps match superblock counts
> - walk directory structure from root, checking for directory corruption
> - check btree validity on inodes for up to 10 seconds (or whatever, storing last checked inode in superblock for restarting this test at next one)
>
> By all means, don't do checks for an hour, or allow users to set the maximum boot check duration in the superblock. I'm sure users don't mind waiting 5s at boot time if it means they don't lose data.

Yes! Yes! I agree so much on this ..

Let fsck always run at boot, and perform checks which take at most a few seconds all together. Then dmesg will tell if something is wrong. Maybe it can also show the error code in /proc/mounts ?