no reiserfs quota in 2.4 yet? 2.4.21-pre4-ac4 says different

2003-02-17 Thread Ookhoi
Hi Reiserfs team,

Today I put a new kernel on a server which has reiserfs and needs quota.
I searched for the quota patches (found them in the mail archive) and
saw that they are very old:

ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20

3 december 2002. They don't apply to a current kernel. 

I decided to use 2.4.20 with the 2.4.21-pre4 and -ac4 patches.

01-quota-v2-2.4.20.diff has this:
 Quota support
 CONFIG_QUOTA
   If you say Y here, you will be able to set per user limits for disk
-  usage (also called disk quotas). Currently, it works only for the
-  ext2 file system. You need additional software in order to use quota
-  support; for details, read the Quota mini-HOWTO, available from
+  usage (also called disk quotas). Currently, it works for the
+  ext2, ext3, and reiserfs file system. You need additional software
+  in order to use quota support (you can download sources from
+  http://www.sf.net/projects/linuxquota/). For further details, read
+  the Quota mini-HOWTO, available from
   http://www.tldp.org/docs.html#howto. Probably the quota
   support is only useful for multi user systems. If unsure, say N.


-ac4 has this:
 Quota support
 CONFIG_QUOTA
   If you say Y here, you will be able to set per user limits for disk
-  usage (also called disk quotas). Currently, it works only for the
-  ext2 file system. You need additional software in order to use quota
-  support; for details, read the Quota mini-HOWTO, available from
+  usage (also called disk quotas). Currently, it works for the
+  ext2, ext3, and reiserfs file system. You need additional software
+  in order to use quota support (you can download sources from
+  http://www.sf.net/projects/linuxquota/). For further details, read
+  the Quota mini-HOWTO, available from
   http://www.tldp.org/docs.html#howto. Probably the quota
   support is only useful for multi user systems. If unsure, say N.

Because none of the outdated patches apply to -pre4-ac4, and because of
the above in -ac4, I thought that a 2.4.21-pre4-ac4 kernel would have
quota.

This, unfortunately, seems not to be the case.

I have this line in fstab:
/dev/md1  /reiserfs noatime,usrquota,grpquota  0  0

and get this error message:
reiserfs_getopt: unknown option usrquota

My quota tools are fresh, 3.08.

Did I do something wrong? The setup worked with patched 2.4.19-rc1, but
that one became old and we needed a few more modules. So for now I
assume I'm bitten by no-quota-in-current-2.4-yet.

If I'm right about that: is there a reason quota is not in 2.4 yet? It has
been stable (for me), and it has existed for quite some time now. Did only
half of the patches make it to Alan? The CONFIG_QUOTA help text is misleading.

Btw, the faq on namesys.com says:
 Is quota-support built-in in the vanilla 2.4 kernels for ReiserFS?

No, quota support for linux kernels from 2.4 branch are bundled
separately and can be obtained from this location. The reason these
patches are not included into 2.4 kernel branch is because they
implement new quota format and need new quota code too, which is too big
of a change for 2.4 series of kernels. Various Linux distributions
vendors (ie SuSE) do ship reiserfs-quota enabled kernels, though.

The "from this location" link points to
ftp://ftp.suse.com/pub/people/mason/patches/reiserfs/quota-2.4 which
contains patches that are one year old.


May I ask, what is the future of quota in reiserfs for the 2.4 kernel?
Should I wait for new patches? Try to apply them by hand, or did too
much change? Will quota be integrated in the 2.4 kernel soonish?

Thanks for your time!



Re: no reiserfs quota in 2.4 yet? 2.4.21-pre4-ac4 says different

2003-02-17 Thread Chris Mason
On Mon, 2003-02-17 at 12:39, Ookhoi wrote:
 Hi Reiserfs team,
 
 Today I put a new kernel on a server which has reiserfs and needs quota.
 I searched for the quota patches (found them in the mail archive) and
 saw that they are very old:
 
 ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20
 
 3 december 2002. They don't apply to a current kernel. 
 

Well, 2.4.20 is the current kernel ;-)  Which kernel do you want them
against?  I've got patches against 2.4.21-preX in testing here, but not
against -ac.  They should merge against -ac now more easily, but I
haven't had time to really test it.

Do you want to try the merge on -ac or would you rather try against
2.4.21-preX?

-chris






Re: no reiserfs quota in 2.4 yet? 2.4.21-pre4-ac4 says different

2003-02-17 Thread Ookhoi
Chris Mason wrote (ao):
 On Mon, 2003-02-17 at 12:39, Ookhoi wrote:
  Today I put a new kernel on a server which has reiserfs and needs
  quota.  I searched for the quota patches (found them in the mail
  archive) and saw that they are very old:
  
  ftp://ftp.namesys.com/pub/reiserfs-for-2.4/testing/quota-2.4.20
  
  3 december 2002. They don't apply to a current kernel. 
 
 Well, 2.4.20 is the current kernel ;-)  Which kernel do you want them
 against?  I've got patches against 2.4.21-preX in testing here, but not
 against -ac.  They should merge against -ac now more easily, but I
 haven't had time to really test it.
 
 Do you want to try the merge on -ac or would you rather try against
 2.4.21-preX

Thanks a lot for your quick answer!

Yes, you are of course right that 2.4.20 is the current kernel. My
mistake.

I would love to use -ac, so patches against that would be great. But you
would make me very happy with patches against -pre too.

Is there any chance that you consider your against-ac patches ready for
inclusion in Alan's kernel patches?



Re: Corrupted/unreadable journal: reiser vs. ext3

2003-02-17 Thread Hans Reiser
Vitaly Fertman wrote:


Ok, so the reiserfs kernel code detects an error on disk, what does it
do?  Print out an error message, maybe BUG?  There is an error field
in the reiserfs superblock, I hope it is set when the kernel detects
something bad.

So, now what happens?  Maybe the user doesn't read their syslog and
doesn't see the error, or the error is just a prelude to memory corruption
which causes the system to crash.  When the system boots again, it goes
on its merry way, mounting the reiserfs filesystem with _known_ errors
on it, using bad allocation bitmaps, directories btrees, etc and maybe
double allocating blocks or overwriting blocks from other files causing
them to become corrupt, etc, etc, etc.  Until finally the filesystem is
totally corrupt, the system crashes miserably, the user emails this list
and reiserfsck has an impossible job trying to fix the filesystem.

Instead, what I propose is to have reiserfsck -a AS A STARTING POINT
simply check for a valid reiserfs superblock and the absence of the
error flag before declaring the filesystem clean and allowing the
system to boot.

What's even worse, the reiserfs_read_super (at least 2.4.18 RH kernel)
code OVERWRITES the superblock error status at mount time, making it
worse than useless, since each mount hides any errors that were detected
before the crash:

	s->u.reiserfs_sb.s_mount_state = SB_REISERFS_STATE(s);
	s->u.reiserfs_sb.s_mount_state = REISERFS_VALID_FS ;
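
To make the complaint concrete, here is a small Python model of the two
assignments quoted above. It is only a sketch, not the kernel code: the
Superblock class, the two mount functions, and the numeric state values are
assumptions made for illustration. The point is that the second assignment
discards what the first one read, so an error state recorded before the crash
never reaches the in-memory superblock.

```python
# Illustrative state values (assumption); the real kernel constants may differ.
REISERFS_VALID_FS = 1
REISERFS_ERROR_FS = 2


class Superblock:
    """Hypothetical stand-in for the on-disk superblock."""

    def __init__(self, on_disk_state):
        self.on_disk_state = on_disk_state


def mount_overwriting(sb):
    """Mirrors the quoted pair of assignments: read the state, then clobber it."""
    mount_state = sb.on_disk_state     # plays the role of SB_REISERFS_STATE(s)
    mount_state = REISERFS_VALID_FS    # the value read above is lost here
    return mount_state


def mount_preserving(sb):
    """Keeps whatever state was recorded on disk before the crash."""
    return sb.on_disk_state


crashed = Superblock(REISERFS_ERROR_FS)
print("overwriting mount sees error:", mount_overwriting(crashed) == REISERFS_ERROR_FS)
print("preserving mount sees error:", mount_preserving(crashed) == REISERFS_ERROR_FS)
```
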
 

Andreas seems reasonable, Vitaly, what are your thoughts?

   

Next, add journal replay to reiserfsck if it isn't already there,
 

Why, when it is in the kernel?
   

Because that is the next stage to allowing reiserfsck do checks on the
filesystem after a crash.  Do you tell me you would rather (and you
must, because it obviously currently does) have reiserfsck just throw
away everything in the journal, leaving possibly inconsistent data in
the filesystem for it to check?  Or maybe make the user mount the
filesystem (which obviously has problems or they wouldn't be running
reiserfsck to do a full check) just to clear out the journal and maybe
risk crashing or corruption if the filesystem is strangely corrupted?
 

Vitaly, answer this.
   


Ok, so probably we should make the following changes. The kernel sets IO_ERROR
and FS_ERROR flags.
In the case of IO_ERROR, reiserfsck prints a message about hardware problems
and returns an error, so the fs does not get mounted at boot. On an attempt to
mount the fs with the IO_ERROR flag set, it is mounted ro with a message about
hardware problems. When you are sure that the problems have disappeared you can
mount it with a special option that clears this flag, and reiserfstune will
probably have an option to clear these flags as well.
In the case of FS_ERROR (search_by_key failed, access beyond end of device,
or similar), reiserfsck gets the -a option at boot, replays the journal if needed
and checks for the flag. No flag: return OK. Otherwise: run fix-fixable. If errors
are left, it returns 'errors left uncorrected' and the fs does not get mounted at
boot. On an attempt to mount the fs with the flag set, just print a message about
mounting the fs with errors and mount it. Not ro here, as the kernel will not do
a deep analysis of the errors and it could be just a small insignificant error.
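
The decision rules in this proposal can be written down as a short Python
sketch. Everything named here is an assumption made for illustration: the bit
values of the two flags, the function names, and the fix_fixable callback that
stands in for the actual repair pass. It models only the outcomes described
above, not reiserfsck itself.

```python
# Hypothetical bit values; the thread names the flags but not their encoding.
IO_ERROR = 0x1
FS_ERROR = 0x2


def boot_fsck(flags, fix_fixable):
    """Model of the proposed `reiserfsck -a` run at boot.

    flags: error bits found in the superblock.
    fix_fixable: callable that attempts the repair and returns True
                 if errors remain afterwards.
    Returns (message, ok_to_mount).
    """
    if flags & IO_ERROR:
        # Hardware trouble: report it and refuse to mount at boot.
        return ("hardware problems", False)
    # The journal would be replayed here, if needed, before checking FS_ERROR.
    if not flags & FS_ERROR:
        return ("OK", True)
    if fix_fixable():
        return ("errors left uncorrected", False)
    return ("OK", True)


def mount_mode(flags):
    """Model of what a later mount attempt does with each flag."""
    if flags & IO_ERROR:
        return "ro"                # read-only, with a hardware warning
    if flags & FS_ERROR:
        return "rw-with-warning"   # mounted anyway, with a message
    return "rw"
```

Note the asymmetry the proposal asks for: IO_ERROR forces a read-only mount,
while FS_ERROR only produces a warning, because the kernel cannot tell a
serious inconsistency from a trivial one.
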

 

Sounds good to me.  Do it.  Reiser4 also.

--
Hans





What is [PATCH] 02-directio-fix.diff (namesys.com) for?

2003-02-17 Thread Manuel Krause
Hi!

Is this patch from 030213 needed by anyone using ReiserFS with
2.4.20 and 2.4.21-preX?

What is the DIRECT IO with reiserfs mentioned in the topic line of the patch:
# reiserfs: Fix DIRECT IO interference with tail packing ?

Thanks for the info and best regards,

Manuel

(I hope I didn't miss any hidden announcement...)



Re: What is [PATCH] 02-directio-fix.diff (namesys.com) for?

2003-02-17 Thread Chris Mason
On Mon, 2003-02-17 at 15:55, Manuel Krause wrote:
 Hi!
 
 Is this patch from 030213 needed by anyone using ReiserFS with
 2.4.20 and 2.4.21-preX?
 
 What is DIRECT IO with reiserfs from the topic line of the patch:
 # reiserfs: Fix DIRECT IO interference with tail packing ?

It fixes a bug where a recently unpacked tail might race to the disk
with bytes modified via DIRECT IO.  The common way to trigger the bug is
via a mixture of direct io and regular file access at the same time.

Most people won't see the bug, since it is uncommon to mix regular and
direct io that way.

-chris





Error - Partition Correspondence [was Re: Corrupted/unreadable journal: reiser vs. ext3]

2003-02-17 Thread Manuel Krause
On 02/17/2003 08:43 PM, Hans Reiser wrote:

Vitaly Fertman wrote:


Ok, so the reiserfs kernel code detects an error on disk, what does it
do?  Print out an error message, maybe BUG?  There is an error field
in the reiserfs superblock, I hope it is set when the kernel detects
something bad.

So, now what happens?  Maybe the user doesn't read their syslog and
doesn't see the error, or the error is just a prelude to memory corruption
which causes the system to crash.  When the system boots again, it goes
on its merry way, mounting the reiserfs filesystem with _known_ errors
on it, using bad allocation bitmaps, directories btrees, etc and maybe
double allocating blocks or overwriting blocks from other files causing
them to become corrupt, etc, etc, etc.  Until finally the filesystem is
totally corrupt, the system crashes miserably, the user emails this list
and reiserfsck has an impossible job trying to fix the filesystem.

Instead, what I propose is to have reiserfsck -a AS A STARTING POINT
simply check for a valid reiserfs superblock and the absence of the
error flag before declaring the filesystem clean and allowing the
system to boot.

What's even worse, the reiserfs_read_super (at least 2.4.18 RH kernel)
code OVERWRITES the superblock error status at mount time, making it
worse than useless, since each mount hides any errors that were detected
before the crash:

s->u.reiserfs_sb.s_mount_state = SB_REISERFS_STATE(s);
s->u.reiserfs_sb.s_mount_state = REISERFS_VALID_FS ;


Andreas seems reasonable, Vitaly, what are your thoughts?

  

Next, add journal replay to reiserfsck if it isn't already there,


Why, when it is in the kernel?
  

Because that is the next stage to allowing reiserfsck do checks on the
filesystem after a crash.  Do you tell me you would rather (and you
must, because it obviously currently does) have reiserfsck just throw
away everything in the journal, leaving possibly inconsistent data in
the filesystem for it to check?  Or maybe make the user mount the
filesystem (which obviously has problems or they wouldn't be running
reiserfsck to do a full check) just to clear out the journal and maybe
risk crashing or corruption if the filesystem is strangely corrupted?


Vitaly, answer this.
  


Ok, so probably we should make the following changes. The kernel sets IO_ERROR
and FS_ERROR flags.
In the case of IO_ERROR, reiserfsck prints a message about hardware problems
and returns an error, so the fs does not get mounted at boot. On an attempt to
mount the fs with the IO_ERROR flag set, it is mounted ro with a message about
hardware problems. When you are sure that the problems have disappeared you can
mount it with a special option that clears this flag, and reiserfstune will
probably have an option to clear these flags as well.
In the case of FS_ERROR (search_by_key failed, access beyond end of device,
or similar), reiserfsck gets the -a option at boot, replays the journal if needed
and checks for the flag. No flag: return OK. Otherwise: run fix-fixable. If errors
are left, it returns 'errors left uncorrected' and the fs does not get mounted at
boot. On an attempt to mount the fs with the flag set, just print a message about
mounting the fs with errors and mount it. Not ro here, as the kernel will not do
a deep analysis of the errors and it could be just a small insignificant error.

 

Sounds good to me.  Do it.  Reiser4 also.


Hi!

BTW, do ReiserFS error messages nowadays print a usable partition
identification (like Chris's current data-logging patches do at mount,
e.g.)?

I almost always have 2 ReiserFS partitions mounted, so is an error message
in my logs still impossible to attribute to one of them?

[For a long time now (more than 6 months) I have not had any ReiserFS
errors, even with data-logging and preempt-kernel applied. I
only read about them on the list. So I no longer know what the variables in
the error messages really mean... :-( or really :-)))   ]

I posted about this some 3.6-ReiserFS levels ago and someone on
your team wanted to implement it after his task list was done, IIRC.

So, if it's not yet implemented explicitly, it would seem
to me to be valuable for users, too, IMO.


Best regards,

Manuel



lexicographic ordering is not always best

2003-02-17 Thread Hans Reiser
Suppose that you have small directories, such that the time to do linear 
searching within the directory is not significant.

Suppose that you have a tendency to access files in readdir() order, and
that having files laid out in the same order as the directory order is
valuable for performance.

Suppose that you create files too slowly for allocate on flush to fix
this problem, and access them too soon for the repacker to fix this problem.

In that case, ordering both directory entries and file bodies in
first-created, first-ordered order is optimal.

How much work would it be to create a reiser4 directory plugin to order 
in creation time order?  Could you do this by simply setting the hash 
field always to zero for that plugin, and letting the duplicate key code 
handle things?  If it is trivial to do, it might be useful.  Especially 
for the analysis of the performance of our algorithms on various benchmarks.
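
The zero-hash idea can be illustrated with a toy model in Python. This is a
sketch under stated assumptions, not reiser4 code: the Directory class is
hypothetical, crc32 merely stands in for a real name hash, and the per-entry
sequence number is an assumed model of how duplicate keys might be kept in
insertion order.

```python
import itertools
import zlib


class Directory:
    """Toy directory whose entries are ordered by (hash, sequence) keys."""

    def __init__(self, hash_fn):
        self._hash_fn = hash_fn
        self._seq = itertools.count()   # assumed tie-breaker for duplicate keys
        self._entries = []

    def create(self, name):
        key = (self._hash_fn(name), next(self._seq))
        self._entries.append((key, name))

    def readdir(self):
        # Entries come back in key order, as they would from a sorted tree.
        return [name for _key, name in sorted(self._entries)]


def name_hash(name):
    """Stand-in for a real name hash (not the hash reiserfs uses)."""
    return zlib.crc32(name.encode("utf-8"))


def zero_hash(name):
    """The proposed plugin behaviour: every name hashes to zero."""
    return 0


by_creation = Directory(zero_hash)
for name in ["zebra", "apple", "mango"]:
    by_creation.create(name)
print(by_creation.readdir())
```

With every hash equal, only the tie-breaker distinguishes the keys, so
readdir() order collapses to creation order. The cost is that a lookup by name
can no longer use the hash to narrow the search, which is why the idea is
framed for small directories only.
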

Are you ready to work on implementing file body key assignment in order 
of directory entries?

--
Hans




Re: Corrupted/unreadable journal: reiser vs. ext3

2003-02-17 Thread Andreas Dilger
On Feb 14, 2003  22:19 +0300, Hans Reiser wrote:
 Andreas Dilger wrote:
  You are well aware
 that the e2fsck check intervals can be tuned per-filesystem and even
 disabled if desired (it prints options for how to do this at mke2fs time
 and is clearly documented for the experienced user).  For a boot-once-a-day
 machine, the default is to check about once a month (at most 6 months for
 the time check), and if machines are crashing more often, then they should
 probably be checked more often because _something_ has to be causing crashes.
 
 The idea that how often you boot determines how often it checks is just 
 silly, sorry.

I guess the shortcoming in the ext2 case is that it counts mounts and
not crashes.  If it were counting the number of times the filesystem
was uncleanly shut down instead of normal shutdowns, would that be more
acceptable?  The reason I'm still interested in crashes, even if they
are not filesystem-related crashes, is because there had to be _something_
which caused a crash (bad code, bad hardware, whatever), and once you have
any driver corrupting memory the chance that it is also corrupting filesystem
memory exists.
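
The difference between counting mounts and counting unclean shutdowns can be
sketched in a few lines of Python. The counter names, the thresholds, and the
functions below are hypothetical and chosen only for illustration; they are
not taken from e2fsck or reiserfsck.

```python
from dataclasses import dataclass


@dataclass
class FsckCounters:
    """Hypothetical per-filesystem counters kept in the superblock."""
    mounts_since_check: int = 0
    unclean_since_check: int = 0


def record_mount(counters, was_clean_shutdown):
    """Update the counters at mount time."""
    counters.mounts_since_check += 1
    if not was_clean_shutdown:
        counters.unclean_since_check += 1


def check_due_by_mounts(counters, max_mounts):
    """Mount-count policy: every boot counts, clean or not."""
    return counters.mounts_since_check >= max_mounts


def check_due_by_crashes(counters, max_unclean):
    """Crash-count policy: only unclean shutdowns count."""
    return counters.unclean_since_check >= max_unclean


# A machine that is booted once a day and always shut down cleanly.
daily = FsckCounters()
for _ in range(30):
    record_mount(daily, was_clean_shutdown=True)
print(check_due_by_mounts(daily, 30), check_due_by_crashes(daily, 2))
```

Under the crash-count policy the daily-boot machine is never forced into a
check, while a machine that crashes twice in three boots is.
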

 Having reiserfsck just do read-only checks shouldn't force you to type
 yes (and we mean yes because this is so scary, mere mortals shouldn't
 be doing this).  Hans, you've always talked about making things easy for
 the average user (error messages and such), don't you think that making
 a data consistency check for the user a little less intimidating too?

 I think that you should have to agree that you have time to wait for 
 fsck before you get stuck with a 1 day large server fsck.

That is definitely true.  However, my assumption would be that if someone
is running a system with terabytes of data they will read the man page
after waiting a day for fsck to complete, or lose their job.  It is entirely
possible for administrators to disable the per-mount e2fsck checking, and
the time-based (6 months by default) checking too, and do fsck themselves.
My experience would be that, like backups, people don't do that, so leaving
the 6 month check in protects users from themselves.

The other thing to keep in mind is that you can have different levels of
automated fsck at boot time, depending on how long they take.  You never
necessarily have to try and fix anything with fsck -a, just detect errors
and leave it up to the user to decide what to do if you find a problem:
- always recover journal, validate superblock, error flag (< 1s)

Don't know how long it takes these things to run, so it is up to you to
trade off checks vs. speed, and you could even round-robin them (storing
the last checked item in the superblock or something):
- check block allocation bitmaps match superblock counts
- walk directory structure from root, checking for directory corruption
- check btree validity on inodes for up to 10 seconds (or whatever, storing
  last checked inode in superblock for restarting this test at next one)

By all means, don't do checks for an hour, or allow users to set the maximum
boot check duration in the superblock.  I'm sure users don't mind waiting
5s at boot time if it means they don't lose data.
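
A time-budgeted, round-robin check schedule of this kind might look like the
following Python sketch. It is an illustration only: run_boot_checks, its
parameters, and the idea of returning the index to store in the superblock are
assumptions, and the cheap always-run steps (journal recovery, superblock and
error flag validation) are left out.

```python
import time


def run_boot_checks(checks, start, budget_seconds, clock=time.monotonic):
    """Run checks round-robin, starting at index `start`, until the time
    budget is used up or every check has run once.

    checks: list of (name, callable) pairs; each callable returns True if OK.
    Returns (results, next_start); next_start is the value that would be
    stored in the superblock so the next boot resumes where this one stopped.
    """
    deadline = clock() + budget_seconds
    results = []
    index = start % len(checks)
    for _ in range(len(checks)):
        if clock() >= deadline:
            break
        name, check = checks[index]
        results.append((name, check()))
        index = (index + 1) % len(checks)
    return results, index
```

The budget is only tested between checks, so a single long check can still
overrun it; a check such as the btree walk would need its own internal limit
and resume point, as suggested above.
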

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/




Re: Error - Partition Correspondence [was Re: Corrupted/unreadable journal: reiser vs. ext3]

2003-02-17 Thread Oleg Drokin
Hello!

On Tue, Feb 18, 2003 at 12:35:23AM +0100, Manuel Krause wrote:

 BTW, do ReiserFS error messages nowadays print a usable partition
 identification (like Chris's current data-logging patches do at mount,
 e.g.)?

Sometimes it does.

 I almost always have 2 ReiserFS partitions mounted, so is an error message
 in my logs still impossible to attribute to one of them?

It depends on what the messages are.

 I posted about this some 3.6-ReiserFS levels ago and someone on
 your team wanted to implement it after his task list was done, IIRC.

Yes. I have a patch dating back to May 7th, 2002. But it was never
accepted, for a reason I no longer remember.
I will dig through my email, though. I will probably give it another try.

Bye,
Oleg



fsck on boot (was: Re: Corrupted/unreadable journal: reiser vs. ext3)

2003-02-17 Thread Ookhoi
Andreas Dilger wrote (ao):
 The other thing to keep in mind is that you can have different
 levels of automated fsck at boot time, depending on how long they
 take.  You never necessarily have to try and fix anything with fsck
 -a, just detect errors and leave it up to the user to decide what to
 do if you find a problem:
 - always recover journal, validate superblock, error flag (< 1s)
 
 Don't know how long it takes these things to run, so it is up to you
 to trade off checks vs. speed, and you could even round-robin them
 (storing the last checked item in the superblock or something):
 - check block allocation bitmaps match superblock counts
 - walk directory structure from root, checking for directory
   corruption
 - check btree validity on inodes for up to 10 seconds (or whatever,
   storing last checked inode in superblock for restarting this test at
   next one)
 
 By all means, don't do checks for an hour, or allow users to set the
 maximum boot check duration in the superblock.  I'm sure users don't
 mind waiting 5s at boot time if it means they don't lose data.

Yes! Yes! I agree so much on this .. Let fsck always run at boot, and
perform checks which take at most a few seconds all together.

Then dmesg will tell if something is wrong. Maybe it can also show the
error code in /proc/mounts ?